This book is dedicated, 
in respect and admiration, 
to 



The Principle of Least Action. 



“The author has spared himself no pains in his endeavour to 
present the main ideas in the simplest and most intelligible form, 
and on the whole, in the sequence and connection in which they 
actually originated. In the interest of clearness, it appeared to 
me inevitable that I should repeat myself frequently, without pay- 
ing the slightest attention to the elegance of the presentation. I 
adhered scrupulously to the precept of that brilliant theoretical 
physicist L. Boltzmann, according to whom matters of elegance 
ought to be left to the tailor and to the cobbler.” 

Albert Einstein, in Relativity, the Special and General Theory, 
(1961), p. v. 
Preface 



“In almost all textbooks, even the best, this 
principle is presented so that it is impossible to 
understand.” (K. Jacobi Lectures on Dynamics, 
1842-1843). I have not chosen to break with 
tradition. 

V.I. Arnold, Mathematical Methods of Classical 
Mechanics (1980), footnote on p. 246 



There has been a remarkable revival of interest in classical me- 
chanics in recent years. We now know that there is much more 
to classical mechanics than previously suspected. The behavior of 
classical systems is surprisingly rich; derivation of the equations of 
motion, the focus of traditional presentations of mechanics, is just 
the beginning. Classical systems display a complicated array of 
phenomena such as non-linear resonances, chaotic behavior, and 
transitions to chaos. 

Traditional treatments of mechanics concentrate most of their 
effort on the extremely small class of symbolically tractable dy- 
namical systems. We concentrate on developing general methods 
for studying the behavior of systems, whether or not they have 
a symbolic solution. Typical systems exhibit behavior that is 
qualitatively different from the solvable systems and surprisingly 
complicated. We focus on the phenomena of motion, and we make 
extensive use of computer simulation to explore this motion. 

Even when a system is not symbolically tractable the tools of 
modern dynamics allow one to extract a qualitative understand- 
ing. Rather than concentrating on symbolic descriptions, we con- 
centrate on geometric features of the set of possible trajectories. 
Such tools provide a basis for the systematic analysis of numerical 
or experimental data. 

Classical mechanics is deceptively simple. It is surprisingly easy 
to get the right answer with fallacious reasoning or without real 
understanding. Traditional mathematical notation contributes 
to this problem. Symbols have ambiguous meanings, which depend 
on context, and often even change within a given context. 1 
For example, a fundamental result of mechanics is the Lagrange 
equations. Using traditional notation the Lagrange equations are 
written 

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}^i} - \frac{\partial L}{\partial q^i} = 0.$$

The Lagrangian L must be interpreted as a function of the position 
and velocity components $q^i$ and $\dot{q}^i$ so that the partial 
derivatives make sense, but then in order for the time derivative d/dt 
to make sense solution paths must have been inserted into the 
partial derivatives of the Lagrangian to make functions of time. 
The traditional use of ambiguous notation is convenient in simple 
situations, but in more complicated situations it can be a serious 
handicap to clear reasoning. In order that the reasoning be clear 
and unambiguous, we have adopted a more precise mathematical 
notation. Our notation is functional and follows that of modern 
mathematical presentations . 2 

Computation also enters into the presentation of the mathe- 
matical ideas underlying mechanics. We require that our mathe- 
matical notations be explicit and precise enough so that they can 



1 In his book on mathematical pedagogy [15], Hans Freudenthal argues that 
the reliance on ambiguous, unstated notational conventions in such expressions 
as f(x) and df(x)/dx makes mathematics, and especially introductory calcu- 
lus, extremely confusing for beginning students; and he enjoins mathematics 
educators to use more formal modern notation. 

2 In his beautiful book Calculus on Manifolds (1965), Michael Spivak uses 
functional notation. On p.44 he discusses some of the problems with classical 
notation. We excerpt a particularly juicy quote: 

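The mere statement of [the chain rule] in classical notation requires the 
introduction of irrelevant letters. The usual evaluation for $D_1(f \circ (g, h))$ 
runs as follows: 

If f(u, v) is a function and u = g(x, y) and v = h(x, y) then 

$$\frac{\partial f(g(x,y), h(x,y))}{\partial x} = \frac{\partial f(u,v)}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f(u,v)}{\partial v}\frac{\partial v}{\partial x}.$$

[The symbol $\partial u/\partial x$ means $\frac{\partial}{\partial x} g(x,y)$, and $\frac{\partial}{\partial u} f(u,v)$ means 
$D_1 f(u, v) = D_1 f(g(x, y), h(x, y))$.] This equation is often written simply 

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x}.$$

Note that f means something different on the two sides of the equation! 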
be interpreted automatically, as by a computer. As a consequence 
of this requirement the formulas and equations that appear in the 
text stand on their own. They have clear meaning, independent of 
the informal context. For example, we write Lagrange’s equations 
in functional notation as follows: 3 

$$D(\partial_2 L \circ \Gamma[q]) - \partial_1 L \circ \Gamma[q] = 0$$

The Lagrangian L is a real-valued function of time t, coordinates 
x, and velocities v; the value is L(t, x, v). Partial derivatives 
are indicated as derivatives of functions with respect to particular 
argument positions; $\partial_2 L$ indicates the function obtained by 
taking the partial derivative of the Lagrangian function L with 
respect to the velocity argument position. The traditional partial 
derivative notation, which employs a derivative with respect to a 
“variable,” depends on context and can lead to ambiguity. 4 The 
partial derivatives of the Lagrangian are then explicitly evaluated 
along a path function q. The time derivative is taken and the 
Lagrange equations formed. Each step is explicit; there are no 
implicit substitutions. 
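
To give the flavor of how such an expression can be interpreted by a computer, here is a minimal Python sketch; it is not the notation or software of this book. The names `D`, `partial`, `Gamma`, `compose`, and `lagrange_residual` are hypothetical, derivatives are approximated by central differences with an assumed step `H`, and argument positions are counted from zero, so that position 1 is the coordinate and position 2 is the velocity. The harmonic-oscillator Lagrangian is chosen only for illustration.

```python
import math

H = 1e-4  # finite-difference step; an assumption of this sketch


def D(f):
    """Derivative of a function of one real argument (central difference)."""
    return lambda t: (f(t + H) - f(t - H)) / (2 * H)


def partial(i, f):
    """Partial derivative of f with respect to argument position i (0-based)."""
    def df(*args):
        up, down = list(args), list(args)
        up[i] += H
        down[i] -= H
        return (f(*up) - f(*down)) / (2 * H)
    return df


def Gamma(q):
    """Turn a path q into the function t -> (t, q(t), Dq(t))."""
    Dq = D(q)
    return lambda t: (t, q(t), Dq(t))


def compose(f, g):
    """(f o g)(t), where g(t) is the tuple of arguments handed to f."""
    return lambda t: f(*g(t))


def lagrange_residual(L, q):
    """The function of time D(d2 L o Gamma[q]) - d1 L o Gamma[q]."""
    d2L_on_path = compose(partial(2, L), Gamma(q))
    d1L_on_path = compose(partial(1, L), Gamma(q))
    return lambda t: D(d2L_on_path)(t) - d1L_on_path(t)


M, K = 1.0, 4.0  # illustrative mass and spring constant


def L_harmonic(t, x, v):
    return 0.5 * M * v * v - 0.5 * K * x * x


def true_path(t):
    return math.cos(math.sqrt(K / M) * t)


def wrong_path(t):
    return t * t


residual_true = lagrange_residual(L_harmonic, true_path)(0.3)
residual_wrong = lagrange_residual(L_harmonic, wrong_path)(0.3)
print(residual_true, residual_wrong)
```

Every step of the formula corresponds to one function call: the partial derivatives are taken by argument position, they are composed with the path, and only then is the time derivative taken. The residual is close to zero on the oscillator's actual motion and clearly nonzero on the other path.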

Computational algorithms are used to communicate precisely 
some of the methods used in the analysis of dynamical phenomena. 
Expressing the methods of variational mechanics in a computer 
language forces them to be unambiguous and computationally 
effective. Computation requires us to be precise about the repre- 
sentation of mechanical and geometric notions as computational 
objects and permits us to represent explicitly the algorithms for 
manipulating these objects. Also, once formalized as a procedure, 
a mathematical idea becomes a tool that can be used directly to 
compute results. 

Active exploration on the part of the student is an essential 
part of the learning experience. Our focus is on understanding 
the motion of systems; to learn about motion the student must 
actively explore the motion of systems through simulation and 



3 This is presented here without explanation, to give the flavor of the notation. 
The text gives a full explanation. 

4 “It is necessary to use the apparatus of partial derivatives, in which even the 
notation is ambiguous.” From V.I. Arnold, Mathematical Methods of Classical 
Mechanics (1980), Section 47, p258. See also the footnote on that page. 
experiment. The exercises and projects are an integral part of the 
presentation. 

That the mathematics is precise enough to be interpreted au- 
tomatically allows active exploration to be extended to the math- 
ematics. The requirement that the computer be able to inter- 
pret any expression provides strict and immediate feedback as 
to whether the expression is correctly formulated. Experience 
demonstrates that interaction with the computer in this way un- 
covers and corrects many deficiencies in understanding. 

This book presents classical mechanics from an unusual per- 
spective. It focuses on understanding motion rather than deriving 
equations of motion. It weaves recent discoveries of nonlinear dy- 
namics throughout the presentation, rather than presenting them 
as an afterthought. It uses functional mathematical notation that 
allows precise understanding of fundamental properties of classical 
mechanics. It uses computation to constrain notation, to capture 
and formalize methods, for simulation, and for symbolic analysis. 

This book is the result of teaching classical mechanics at MIT 
for the past six years. The contents of our class began with ideas 
from a class on nonlinear dynamics and solar system dynamics by 
Wisdom and ideas about how computation can be used to formu- 
late methodology developed in the introductory computer science 
class by Abelson and Sussman. When we started we expected that 
using this approach to formulate mechanics would be easy. We 
quickly learned though that there were many things we thought we 
understood that we did not in fact understand. Our requirement 
that our mathematical notations be explicit and precise enough 
so that they can be interpreted automatically, as by a computer, 
is very effective in uncovering puns and flaws in reasoning. The 
resulting struggle to make the mathematics precise, yet clear and 
computationally effective, lasted far longer than we anticipated. 
We learned a great deal about both mechanics and computation 
by this process. We hope others, especially our competitors, will 
adopt these methods that enhance understanding, while slowing 
research. 
Lagrangian Mechanics 



The purpose of mechanics is to describe how 
bodies change their position in space with “time.” 

I should load my conscience with grave sins against 
the sacred spirit of lucidity were I to formulate the 
aims of mechanics in this way, without serious 
reflection and detailed explanations. Let us 
proceed to disclose these sins. 

Albert Einstein Relativity, the Special and General 
Theory, (1961), p. 9. 



The subject of this book is motion, and the mathematical tools 
used to describe it. 

Centuries of careful observations of the motions of the planets 
revealed regularities in those motions, allowing accurate predic- 
tions of phenomena such as eclipses and conjunctions. The effort 
to formulate these regularities and ultimately to understand them 
led to the development of mathematics and to the discovery that 
mathematics could be effectively used to describe aspects of the 
physical world. That mathematics can be used to describe natural 
phenomena is a remarkable fact. 

When a juggler throws a pin it takes a rather predictable path 
and it rotates in a rather predictable way. In fact, the skill of jug- 
gling depends crucially on this predictability. It is also a remark- 
able discovery that the same mathematical tools used to describe 
the motions of the planets can be used to describe the motion of 
the juggling pin. 

Classical mechanics describes the motion of a system of par- 
ticles, subject to forces describing their interactions. Complex 
physical objects, such as juggling pins, can be modeled as myriad 
particles with fixed spatial relationships maintained by stiff forces 
of interaction. 

There are many conceivable ways a system could move that 
never occur. We can imagine that the juggling pin might pause 
in midair or go fourteen times around the head of the juggler be- 
fore being caught, but these motions do not happen. How can 
we distinguish motions of a system that can actually occur from 
other conceivable motions? Perhaps we can invent some mathe- 
matical function that allows us to distinguish realizable motions 
from among all conceivable motions. 

The motion of a system can be described by giving the position 
of every piece of the system at each moment. Such a description of 
the motion of the system is called a configuration path ; the config- 
uration path specifies the configuration as a function of time. The 
juggling pin rotates as it flies through the air; the configuration of 
the juggling pin is specified by giving the position and orientation 
of the pin. The motion of the juggling pin is specified by giving 
the position and orientation of the pin as a function of time. 

The function that we seek takes a configuration path as an 
input and produces some output. We want this function to have 
some characteristic behavior when the input is a realizable path. 
For example, the output could be a number, and we could try to 
arrange that the number is zero only on realizable paths. Newton’s 
equations of motion are of this form; at each moment Newton’s 
differential equations must be satisfied. 

However, there is an alternate strategy that provides more insight 
and power: we could look for a path-distinguishing function 
that has a minimum on the realizable paths — on nearby unreal- 
izable paths the value of the function is higher than it is on the 
realizable path. This is the variational strategy : for each physical 
system we invent a path-distinguishing function that distinguishes 
realizable motions of the system by having a stationary point for 
each realizable path. 1 For a great variety of systems realizable 
motions of the system can be formulated in terms of a variational 
principle. 2 



1 A stationary point of a function is a point where the function’s value does not 
vary as the input is varied. Local maxima or minima are stationary points. 

2 The variational formulation successfully describes all of the Newtonian me- 
chanics of particles and rigid bodies. The variational formulation has also 
been usefully applied in the description of many other systems such as classi- 
cal electrodynamics, the dynamics of inviscid fluids, and the design of mech- 
anisms such as four-bar linkages. In addition, modern formulations of quan- 
tum mechanics and quantum field theory build on many of the same con- 
cepts. However, the variational formulation does not appear to apply to all 
dynamical systems. For example, there is no simple prescription to apply 
the variational apparatus to systems with dissipation, though in special cases 
variational methods still apply. 
Mechanics, as invented by Newton and his contemporaries, describes 
the motion of a system in terms of the positions, velocities, 
and accelerations of each of the particles in the system. In contrast 
to the Newtonian formulation of mechanics, the variational formu- 
lation of mechanics describes the motion of a system in terms of 
aggregate quantities that are associated with the motion of the 
system as a whole. 

In the Newtonian formulation the forces can often be written 
as derivatives of the potential energy of the system. The motion 
of the system is determined by considering how the individual 
component particles respond to these forces. The Newtonian for- 
mulation of the equations of motion is intrinsically a particle-by- 
particle description. 

In the variational formulation the equations of motion are for- 
mulated in terms of the difference of the kinetic energy and the 
potential energy. The potential energy is a number that is char- 
acteristic of the arrangement of the particles in the system; the 
kinetic energy is a number that is determined by the velocities of 
the particles in the system. Neither the potential energy nor the 
kinetic energy depends on how those positions and velocities are 
specified. The difference is characteristic of the system as a whole 
and does not depend on the details of how the system is specified. 
So we are free to choose ways of describing the system that are 
easy to work with; we are liberated from the particle-by-particle 
description inherent in the Newtonian formulation. 

The variational formulation has numerous advantages over the 
Newtonian formulation. The equations of motion for those param- 
eters that describe the state of the system are derived in the same 
way regardless of the choice of those parameters: the method of 
formulation does not depend on the choice of coordinate system. 
If there are positional constraints among the particles of a system 
the Newtonian formulation requires that we consider the forces 
maintaining these constraints, whereas in the variational formu- 
lation the constraints can be built into the coordinates. The vari- 
ational formulation reveals the association of conservation laws 
with symmetries. The variational formulation provides a frame- 
work for placing any particular motion of a system in the context 
of all possible motions of the system. We pursue the variational 
formulation because of these advantages. 







1.1 The Principle of Stationary Action 

Let us suppose that for each physical system there is a path- 
distinguishing function that is stationary on realizable paths. We 
will try to deduce some of its properties. 

Experience of motion 

Our ordinary experience suggests that physical motion can be de- 
scribed by configuration paths that are continuous and smooth . 3 
We do not see the juggling pin jump from one place to another. 
Nor do we see the juggling pin suddenly change the way it is mov- 
ing. 

Our ordinary experience suggests that the motion of physical 
systems does not depend upon the entire history of the system. 
If we enter the room after the juggling pin has been thrown into 
the air we cannot tell when it left the juggler’s hand. The juggler 
could have thrown the pin from a variety of places at a variety 
of times with the same apparent result as we walk in the door . 4 
So the motion of the pin does not depend on the details of the 
history. 

Our ordinary experience suggests that the motion of physical 
systems is deterministic. In fact, a small number of parameters 
summarize the important aspects of the history of the system and 
determine the future evolution of the system. For example, at 
any moment the position, velocity, orientation and rate of change 
of the orientation of the juggling pin are enough to completely 
determine the future motion of the pin. 

Realizable paths 

From our experience of motion we develop certain expectations 
about realizable configuration paths. If a path is realizable, then 
any segment of the path is a realizable path segment. Conversely, 
a path is realizable if every segment of the path is a realizable 



3 Experience with systems on an atomic scale suggests that at this scale systems 
do not travel along well-defined configuration paths. To describe the evolution 
of systems on the atomic scale we employ quantum mechanics. Here, we 
restrict attention to systems for which the motion is well described by a smooth 
configuration path. 

4 Extrapolation of the orbit of the Moon backward in time cannot determine 
the point at which the Moon was placed on this trajectory. To determine 
the origin of the Moon we must supplement dynamical evidence with other 
physical evidence such as chemical compositions. 
path segment. The realizability of a path segment depends on 
all points of the path in the segment. The realizability of a path 
segment depends on every point of the path segment in the same 
way; no part of the path is special. The realizability of a path 
segment depends only on points of the path within the segment; 
the realizability of a path segment is a local property. 

So the path-distinguishing function aggregates some local prop- 
erty of the system measured at each moment along the path seg- 
ment. Each moment along the path must be treated the same way. 
The contributions from each moment along the path segment must 
be combined in a way that maintains the independence of the con- 
tributions from disjoint subsegments. One method of combination 
that satisfies these requirements is to add up the contributions, 
making the path-distinguishing function an integral over the path 
segment of some local property of the path. 5 

So we will try to arrange that the path-distinguishing func- 
tion, constructed as an integral of a local property along the path, 
assumes an extreme value for any realizable path. Such a path- 
distinguishing function is traditionally called an action for the 
system. We use the word “action” to be consistent with common 
usage. Perhaps it would be clearer to continue to call it “path- 
distinguishing function,” but then it would be more difficult for 
others to know what we were talking about. 6 

In order to pursue the agenda of variational mechanics, we must 
invent action functions that are stationary on the realizable tra- 
jectories of the systems we are studying. We will consider actions 
that are integrals of some local property of the configuration path 
at each moment. Let $\gamma$ be the configuration-path function; $\gamma(t)$ 



5 We suspect that this argument can be promoted to a precise constraint on 
the possible ways of making this path-distinguishing function. 

6 Historically, Huygens was the first to use the term “action” in mechanics. He 
used the term to refer to “the effect of a motion.” This is an idea that came 
from the Greeks. In his manuscript “Dynamica” (1690) Leibnitz enunciated a 
“Least Action Principle” using the “harmless action,” which was the product 
of mass, velocity, and the distance of the motion. Leibnitz also spoke of a 
“violent action” in the case where things collided. 
is the configuration at time t. The action of the segment of the 
path $\gamma$ in the time interval from $t_1$ to $t_2$ is 7 

$$S[\gamma](t_1, t_2) = \int_{t_1}^{t_2} \mathcal{F}[\gamma] \tag{1.1}$$

where $\mathcal{F}[\gamma]$ is a function of time that measures some local property 
of the path. It may depend upon the value of the function $\gamma$ at 
that time and the value of any derivatives of $\gamma$ at that time. 8 

The configuration path can be locally described at a moment in 
terms of the configuration, the rate of change of the configuration, 
and all the higher derivatives of the configuration at the given 
moment. Given this information the path can be reconstructed in 
some interval containing that moment. 9 Local properties of paths 
can depend on no more than the local description of the path. 

The function $\mathcal{F}$ measures some local property of the configuration 
path $\gamma$. We can decompose $\mathcal{F}[\gamma]$ into two parts: a part that 
measures some property of a local description and a part that extracts 
a local description of the path from the path function. The 
function that measures the local property of the system depends 
on the particular physical system; the method of construction of a 
local description of a path from a path is the same for any system. 
We can write $\mathcal{F}[\gamma]$ as a composition of these two functions: 10 

$$\mathcal{F}[\gamma] = \mathcal{L} \circ \mathcal{T}[\gamma]. \tag{1.2}$$



7 A definite integral of a real-valued function f of a real argument is written 
$\int_a^b f$. This can also be written $\int_a^b f(x)\,dx$. The first notation emphasizes that 
a function is being integrated. 

8 Traditionally, square brackets are put around functional arguments. In this 
case, the square brackets remind us that the value of S may depend on the 
function $\gamma$ in complicated ways, such as through its derivatives. 

9 In the case of a real-valued function the value of the function and its deriva- 
tives at some point can be used to construct a power series. For sufficiently 
nice functions (real analytic) the power series constructed in this way con- 
verges in some interval containing the point. Not all functions can be locally 
represented in this way. For example, the function $f(x) = \exp(-1/x^2)$, with 
$f(0) = 0$, is zero and has all derivatives zero at x = 0, but this infinite number 
of derivatives is insufficient to determine the function value at any other point. 

10 Here $\circ$ denotes composition of functions: $(f \circ g)(t) = f(g(t))$. In our notation 
the application of a path-dependent function to its path is of higher precedence 
than the composition, so $\mathcal{L} \circ \mathcal{T}[\gamma] = \mathcal{L} \circ (\mathcal{T}[\gamma])$. 
The function $\mathcal{T}$ takes the path and produces a function of time. 
Its value is an ordered tuple containing the time, the configuration 
at that time, the rate of change of the configuration at that time, 
and the values of higher derivatives of the path evaluated at that 
time. For the path $\gamma$ and time t: 11 

$$\mathcal{T}[\gamma](t) = (t, \gamma(t), \mathcal{D}\gamma(t), \ldots) \tag{1.3}$$

We refer to this tuple, which includes as many derivatives as are 
needed, as the local tuple. 

The function $\mathcal{L}$ depends on the specific details of the physical 
system being investigated, but does not depend on any particular 
configuration path. The function $\mathcal{L}$ computes a real-valued local 
property of the path. We will find that $\mathcal{L}$ needs only a finite number 
of components of the local tuple to compute this property: 
The path can be locally reconstructed from the full local description; 
that $\mathcal{L}$ depends on a finite number of components of the local 
tuple guarantees that it measures a local property. 12 

The advantage of this decomposition is that the local descrip- 
tion of the path is computed by a uniform process from the con- 
figuration path, independent of the system being considered. All 
of the system-specific information is captured in the function $\mathcal{L}$. 

The function $\mathcal{L}$ is called a Lagrangian 13 for the system, and the 
resulting action, 

$$S[\gamma](t_1, t_2) = \int_{t_1}^{t_2} \mathcal{L} \circ \mathcal{T}[\gamma], \tag{1.4}$$



11 The derivative $\mathcal{D}\gamma$ of a configuration path $\gamma$ can be defined in terms of 
ordinary derivatives by specifying how it acts on sufficiently smooth real-valued 
functions f of configurations. The exact definition is unimportant at 
this stage. If you are curious see footnote 23. 

12 We will later discover that an initial segment of the local tuple will be 
sufficient to determine the future evolution of the system. That a configuration 
and a finite number of derivatives determines the future means that there is 
a way of determining all of the rest of the derivatives of the path from the 
initial segment. 

13 The classical Lagrangian plays a fundamental role in the path-integral for- 
mulation of quantum mechanics (due to Dirac and Feynman), where the com- 
plex exponential of the classical action yields the relative probability ampli- 
tude for a path. The Lagrangian is the starting point for the Hamiltonian 
formulation of mechanics (discussed in chapter 3), which is also essential in 
the Schrödinger and Heisenberg formulations of quantum mechanics and in 
the Boltzmann-Gibbs approach to statistical mechanics. 
is called the Lagrangian action. Lagrangians can be found for a 
great variety of systems. We will see that for many systems the 
Lagrangian can be taken to be the difference between kinetic and 
potential energy. Such Lagrangians depend only on the time, the 
configuration, and the rate of change of the configuration. We will 
focus on this class of systems, but will also consider more general 
systems from time to time. 
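
As an illustration of how an action of this kind can be computed, here is a minimal Python sketch. The names `D`, `local_tuple`, `integrate`, `action`, and `make_lagrangian` are hypothetical, the local tuple is truncated after the velocity, the derivative is a central difference, and the integral is approximated by Simpson's rule; all of these are assumptions of the sketch rather than anything prescribed here. The system is a free particle of mass 3 moving in one dimension along a straight-line path.

```python
def D(f, h=1e-5):
    """Central-difference derivative of a function of one real argument."""
    return lambda t: (f(t + h) - f(t - h)) / (2 * h)


def local_tuple(path):
    """Map a path to t -> (t, path(t), Dpath(t)), truncated after the velocity."""
    velocity = D(path)
    return lambda t: (t, path(t), velocity(t))


def integrate(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def action(lagrangian, path, t1, t2):
    """Integral over [t1, t2] of the Lagrangian composed with the local tuple."""
    state = local_tuple(path)
    return integrate(lambda t: lagrangian(*state(t)), t1, t2)


def make_lagrangian(mass, potential):
    """Kinetic energy minus potential energy, as a function of (t, x, v)."""
    def lagrangian(t, x, v):
        return 0.5 * mass * v * v - potential(x)
    return lagrangian


L_free = make_lagrangian(3.0, lambda x: 0.0)


def straight_line(t):
    return 4.0 * t + 7.0


S = action(L_free, straight_line, 0.0, 10.0)
print(S)
```

For this path the velocity is constant at 4, so the exact action is (1/2)(3)(16)(10) = 240, and the numerical value agrees to within the finite-difference error. Only `make_lagrangian` knows anything about the particular system; `local_tuple` and `action` are the same for every system.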

A realizable path of the system is to be distinguished from oth- 
ers by having stationary action with respect to some set of nearby 
unrealizable paths. Now some paths near realizable paths will 
also be realizable: for any motion of the juggling pin there is an- 
other that is slightly different. So when addressing the question 
of whether the action is stationary with respect to variations of 
the path we must somehow restrict the set of paths we are con- 
sidering to contain only one realizable path. It will turn out that 
for Lagrangians that depend only on the configuration and rate 
of change of configuration it is enough to restrict the set of paths 
to those that have the same configuration at the endpoints of the 
path segment. 

The Principle of Stationary Action 14 asserts that for each dy- 
namical system we can cook up a Lagrangian such that a realizable 
path connecting the configurations at two times $t_1$ and $t_2$ is distinguished 
from all conceivable paths by the fact that the action 
$S[\gamma](t_1, t_2)$ is stationary with respect to variations of the path. 
For Lagrangians that depend only on the configuration and rate 
of change of configuration the variations are restricted to those 
that preserve the configurations at $t_1$ and $t_2$. 15 
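
A numerical experiment can make this statement concrete. The Python sketch below is an illustration under stated assumptions, not a proof: the system is a one-dimensional free particle of mass 3, the realizable path is a straight line, and the variation `eta` is a hypothetical choice that vanishes at both endpoints. The helper names and the numerical methods (central differences, Simpson's rule) are likewise assumptions of the sketch.

```python
import math


def D(f, h=1e-5):
    """Central-difference derivative of a function of one real argument."""
    return lambda t: (f(t + h) - f(t - h)) / (2 * h)


def integrate(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def action(lagrangian, path, t1, t2):
    velocity = D(path)
    return integrate(lambda t: lagrangian(t, path(t), velocity(t)), t1, t2)


def L_free(t, x, v, mass=3.0):
    return 0.5 * mass * v * v


T1, T2 = 0.0, 10.0


def realizable(t):
    return 4.0 * t + 7.0


def eta(t):
    """A variation that is zero at both endpoints T1 and T2."""
    return math.sin(math.pi * (t - T1) / (T2 - T1))


def varied(epsilon):
    return lambda t: realizable(t) + epsilon * eta(t)


S0 = action(L_free, realizable, T1, T2)
excess = {eps: action(L_free, varied(eps), T1, T2) - S0
          for eps in (-0.2, -0.1, 0.1, 0.2)}
for eps, dS in sorted(excess.items()):
    print(f"epsilon = {eps:+.1f}   S - S0 = {dS:.6f}")
```

The change in the action is the same for +epsilon and -epsilon and quadruples when epsilon doubles, so there is no first-order change: the action is stationary on the straight-line path with respect to this family of endpoint-preserving variations.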



14 The principle is often called the “Principle of Least Action” because its 
initial formulations spoke in terms of the action being minimized rather than 
the more general case of taking on a stationary value. The term “Principle of 
Least Action” is also commonly used to refer to a result, due to Maupertuis, 
Euler, and Lagrange, which says that free particles move along paths for which 
the integral of the kinetic energy is minimized among all paths with the given 
endpoints. Correspondingly, the term “action” is sometimes used to refer 
specifically to the integral of the kinetic energy. (Actually, Euler and Lagrange 
used the vis viva, or twice the kinetic energy.) 

15 Other ways of stating the principle of stationary action make it sound teleo- 
logical and mysterious. For instance, one could imagine that the system con- 
siders all possible paths from its initial configuration to its final configuration 
and then chooses the one with the smallest action. Indeed, the underlying vi- 
sion of a purposeful, economical, and rational universe played no small part in 
the philosophical considerations that accompanied the initial development of 
Exercise 1.1: Fermat optics 

Fermat observed that the laws of reflection and refraction could be ac- 
counted for by the following facts: Light travels in a straight line in any 
particular medium with a velocity that depends upon the medium. The 
path taken by a ray from a source to a destination through any sequence 
of media is a path of least total time, compared to neighboring paths. 
Show that these facts do imply the laws of reflection and refraction. 16 
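
Before attempting the derivation, it may help to check the claim numerically for refraction. The Python sketch below is only a sanity check, not the requested demonstration. It assumes a flat interface along y = 0, a source at height a above it, a destination at depth b below it and a horizontal distance d away, and illustrative speeds v1 and v2; the function names and the ternary search are choices made for this sketch.

```python
import math


def travel_time(x, a, b, d, v1, v2):
    """Time for a ray that crosses the interface at (x, 0)."""
    return math.hypot(x, a) / v1 + math.hypot(d - x, b) / v2


def least_time_crossing(a, b, d, v1, v2, iterations=200):
    """Ternary search for the crossing point that minimizes the travel time."""
    lo, hi = 0.0, d
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, a, b, d, v1, v2) < travel_time(m2, a, b, d, v1, v2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


a, b, d = 1.0, 1.0, 2.0
v1, v2 = 1.0, 0.75  # illustrative speeds in the two media

x = least_time_crossing(a, b, d, v1, v2)
sin_incidence = x / math.hypot(x, a)
sin_refraction = (d - x) / math.hypot(d - x, b)
print(sin_incidence / sin_refraction, v1 / v2)
```

At the least-time crossing point the ratio of the sines of the two angles, measured from the normal to the interface, matches the ratio of the speeds, which is the inverse of the ratio of the refractive indices.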



1.2 Configuration Spaces 

Let us consider mechanical systems that can be thought of as 
composed of constituent point particles, with mass and position, 
but with no internal structure. 17 Extended bodies may be thought 
of as composed of a large number of these constituent particles 
with specific spatial relationships between them. Extended bodies 
maintain their shape because of spatial constraints between the 
constituent particles. Specifying the position of all the constituent 
particles of a system specifies the configuration of the system. The 
existence of constraints between parts of the system, such as those 
that determine the shape of an extended body, means that the 
constituent particles cannot assume all possible positions. The 
set of all configurations of the system that can be assumed is 
called the configuration space of the system. The dimension of the 



mechanics. The earliest action principle that remains part of modern physics is 
Fermat’s Principle, which states that the path traveled by a light ray between 
two points is the path that takes the least amount of time. Fermat formu- 
lated this principle around 1660 and used it to derive the laws of reflection 
and refraction. Motivated by this, the French mathematician and astronomer 
Pierre-Louis Moreau de Maupertuis enunciated the Principle of Least Action 
as a grand unifying principle in physics. In his Essai de cosmologie (1750) 
Maupertuis appealed to this principle of “economy in nature” as evidence of 
the existence of God, asserting that it demonstrated “God’s intention to regu- 
late physical phenomena by a general principle of the highest perfection.” For 
a historical perspective of Maupertuis’s, Euler’s, and Lagrange’s roles in the 
formulation of the principle of least action, see Jourdain [25]. 

16 For reflection the angle of incidence is equal to the angle of reflection. Re- 
fraction is described by Snell’s law. Snell’s Law is that when light passes from 
one medium to another, the ratio of the sines of the angles made to the normal 
to the interface is the inverse of the ratio of the refractive indices of the media. 
The refractive index is the ratio of the speed of light in the vacuum to the 
speed of light in the medium. 

17 We often refer to a point particle with mass but no internal structure as a 
point mass. 
configuration space is the smallest number of parameters that have
to be given to completely specify a configuration. The dimension 
of the configuration space is also called the number of degrees of 
freedom of the system. 18 

For a single unconstrained particle it takes three parameters to 
specify the configuration. Thus the configuration space of a point 
particle is three dimensional. If we are dealing with a system with 
more than one point particle, the configuration space is more com- 
plicated. If there are k separate particles we need 3k parameters 
to describe the possible configurations. If there are constraints 
among the parts of a system the configuration is restricted to a 
lower-dimensional space. For example, a system consisting of two 
point particles constrained to move in three dimensions so that the 
distance between the particles remains fixed has a five-dimensional 
configuration space: with three numbers we can fix
the position of one particle, and with two others we can give the 
position of the other particle relative to the first. 

Consider a juggling pin. The configuration of the pin is specified 
if we give the positions of every atom making up the pin. However, 
there exist more economical descriptions of the configuration. In 
the idealization that the juggling pin is truly rigid, the distances 
among all the atoms of the pin remain constant. So we can specify 
the configuration of the pin by giving the position of a single atom 
and the orientation of the pin. Using the constraints, the positions 
of all the other constituents of the pin can be determined from 
this information. The dimension of the configuration space of 
the juggling pin is six: the minimum number of parameters that 
specify the position in space is three, and the minimum number 
of parameters that specify an orientation is also three. 
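
The counting used in these two examples can be summarized in one line. This is only a sketch, valid under the assumption that the constraints are independent and integrable; the symbols $k$ (number of particles) and $c$ (number of constraints) are introduced here for illustration, and locating the rigid pin by three non-collinear reference points is one convenient choice among many.

```latex
\dim(\text{configuration space}) = 3k - c, \qquad
\underbrace{3\cdot 2 - 1 = 5}_{\text{two particles at fixed separation}}, \qquad
\underbrace{3\cdot 3 - 3 = 6}_{\text{rigid body located by three non-collinear points}} .
```

In the rigid-body case the three constraints are the three fixed distances among the reference points; every other atom of the pin is then located by its fixed distances to those three points.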

As a system evolves with time, the constituent particles move 
subject to the constraints. The motion of each constituent particle 



18 Strictly speaking the dimension of the configuration space and the number 
of degrees of freedom are not the same. The number of degrees of freedom is 
the dimension of the space of configurations that are “locally accessible.” For 
systems with integrable constraints the two are the same. For systems with 
non-integrable constraints the configuration dimension can be larger than the 
number of degrees of freedom. For further explanation see the discussion of 
systems with non-integrable constraints below (section 1.10.3). Apart from 
that discussion, all of the systems we will consider have integrable constraints 
(they are “holonomic”). This is why we have chosen to blur the distinction be- 
tween the number of degrees of freedom and the dimension of the configuration 
space. 
is specified by describing the changing configuration. Thus, the
motion of the system may be described as evolving along a path 
in configuration space. The configuration path may be specified 
by a function, the configuration-path function, which gives the 
configuration of the system at any time. 

Exercise 1.2: Degrees of freedom 

For each of the mechanical systems described below, give the number of 
degrees of freedom of the configuration space. 

a. Three juggling pins. 

b. A spherical pendulum, consisting of a point mass hanging from a 
rigid massless rod attached to a fixed support point. The pendulum 
bob may move in any direction subject to the constraint imposed by the 
rigid rod. The point mass is subject to the uniform force of gravity. 

c. A spherical double pendulum, consisting of one point-mass hanging 
from a rigid massless rod attached to a second point-mass hanging from 
a second massless rod attached to a fixed support point. The point mass 
is subject to the uniform force of gravity. 

d. A point mass sliding without friction on a rigid curved wire. 

e. A top consisting of a rigid axisymmetric body with one point on the 
symmetry axis of the body attached to a fixed support, subject to a 
uniform gravitational force. 

f. The same as e, but not axisymmetric. 



1.3 Generalized Coordinates 

In order to be able to talk about specific configurations we need to 
have a set of parameters that label the configurations. The param- 
eters that are used to specify the configuration of the system are 
called the generalized coordinates. Consider an unconstrained free 
particle. The configuration of the particle is specified by giving 
its position. This requires three parameters. The unconstrained 
particle has three degrees of freedom. One way to specify the po- 
sition of a particle is to specify its rectangular coordinates relative 
to some chosen coordinate axes. The rectangular components of 
the position are generalized coordinates for an unconstrained par- 
ticle. Or consider an ideal planar double pendulum: a point mass 
constrained to always be a given distance from a fixed point by a 
rigid rod, with a second mass that is constrained to be at a given 
distance from the first mass by another rigid rod, all confined to a
vertical plane. The configuration is specified if the orientation of
the two rods is given. This requires at least two parameters; the 
planar double pendulum has two degrees of freedom. One way to 
specify the orientation of each rod is to specify the angle it makes 
with the vertical. These two angles are generalized coordinates 
for the planar double pendulum. 

The number of coordinates need not be the same as the dimen- 
sion of the configuration space, though there must be at least that 
many. We may choose to work with more parameters than neces- 
sary, but then the parameters will be subject to constraints that 
restrict the system to possible configurations, that is, to elements 
of the configuration space. 

For the planar double pendulum described above, the two angle 
coordinates are enough to specify the configuration. We could 
also take as generalized coordinates the rectangular coordinates of 
each of the masses in the plane, relative to some chosen coordinate 
axes. These are also fine coordinates, but we will have to explicitly 
keep in mind the constraints that limit the possible configurations 
to the actual geometry of the system. Sets of coordinates with 
the same dimension as the configuration space are easier to work 
with because we do not have to deal with explicit constraints 
among the coordinates. So for the time being we will consider 
only formulations where the number of configuration coordinates 
is equal to the number of degrees of freedom; later we will learn 
how to handle systems with redundant coordinates and explicit 
constraints. 
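
The relationship between the two descriptions of the planar double pendulum can be made concrete with a short sketch. It is written in plain Scheme rather than with the Scmutils utilities, and the procedure names, the rod lengths l1 and l2, and the conventions (support at the origin, y axis pointing up, angles measured from the downward vertical) are all assumptions made for illustration.

```scheme
(define (sq x) (* x x))

;; Hypothetical helper: map the two angle coordinates to the rectangular
;; coordinates (x1 y1 x2 y2) of the two masses.
(define (double-pendulum->rectangular l1 l2 theta1 theta2)
  (let* ((x1 (* l1 (sin theta1)))
         (y1 (- (* l1 (cos theta1))))
         (x2 (+ x1 (* l2 (sin theta2))))
         (y2 (- y1 (* l2 (cos theta2)))))
    (list x1 y1 x2 y2)))

;; The two constraints on the four rectangular coordinates: the distance of
;; the first mass from the support and the distance between the two masses.
(define (rod-lengths rect)
  (let ((x1 (list-ref rect 0))
        (y1 (list-ref rect 1))
        (x2 (list-ref rect 2))
        (y2 (list-ref rect 3)))
    (list (sqrt (+ (sq x1) (sq y1)))
          (sqrt (+ (sq (- x2 x1)) (sq (- y2 y1)))))))
```

Two numbers go in and four come out. The four outputs are not independent: whatever angles are supplied, rod-lengths applied to the result returns the two rod lengths, which is exactly the pair of constraints that the redundant rectangular description has to carry along.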

In general, the configurations form a space $M$ of some dimension
$n$. The $n$-dimensional configuration space can be parametrized
by choosing a coordinate function $\chi$ that maps elements of the
configuration space to $n$-tuples of real numbers. If there is more
than one dimension, the function $\chi$ is a tuple of $n$ independent
coordinate functions 19 $\chi^i$, $i = 0, \ldots, n-1$, where each $\chi^i$ is a
real-valued function defined on some region of the configuration
space. 20 For a given configuration $m$ in the configuration space $M$



19 A tuple of functions that all have the same domain is itself a function on 
that domain: Given a point in the domain the value of the tuple of functions 
is a tuple of the values of the component functions at that point. 

20 The use of superscripts to index the coordinate components is traditional, 
even though there is potential confusion, say, with exponents. We use zero- 
based indexing. 
the values $\chi^i(m)$ of the coordinate functions are the generalized
coordinates of the configuration. These generalized coordinates 
permit us to identify points of the n-dimensional configuration 
space with n-tuples of real numbers. 21 For any given configura- 
tion space, there are a great variety of ways to choose generalized 
coordinates. Even for a single point moving without constraints, 
we can choose rectangular coordinates, polar coordinates, or any 
other coordinate system that strikes our fancy. 

The motion of the system can be described by a configuration
path $\gamma$ mapping time to configuration-space points. Corresponding
to the configuration path is a coordinate path $q = \chi \circ \gamma$ mapping
time to tuples of generalized coordinates. If there is more than
one degree of freedom the coordinate path is a structured object:
$q$ is a tuple of component coordinate path functions $q^i = \chi^i \circ \gamma$.
At each instant of time $t$, the values $q(t) = (q^0(t), \ldots, q^{n-1}(t))$ are
the generalized coordinates of a configuration.

The derivative $Dq$ of the coordinate path $q$ is a function 22 that
gives the rate of change of the configuration coordinates at a given
time: $Dq(t) = (Dq^0(t), \ldots, Dq^{n-1}(t))$. The rate of change of a
generalized coordinate is called a generalized velocity.
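
As a hedged illustration (polar coordinates are mentioned above but not worked out in the text), suppose a point moves in a plane and we choose polar coordinates as the generalized coordinates. Assuming the usual convention $x = r\cos\theta$, $y = r\sin\theta$, the coordinate path and its derivative are related to the rectangular description as follows:

```latex
\begin{aligned}
q(t) &= (r(t), \theta(t)), \qquad Dq(t) = (Dr(t), D\theta(t)), \\
Dx(t) &= Dr(t)\cos\theta(t) - r(t)\, D\theta(t)\sin\theta(t), \\
Dy(t) &= Dr(t)\sin\theta(t) + r(t)\, D\theta(t)\cos\theta(t).
\end{aligned}
```

The generalized velocities $Dr$ and $D\theta$ are not the rectangular velocity components, and $D\theta$ does not even have the dimensions of a velocity.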

We can make coordinate representations for higher derivatives
of the path as well. We introduce the function $\bar{\chi}$ (pronounced



21 More precisely, the generalized coordinates identify open subsets of the configuration
space with open subsets of $\mathbf{R}^n$. It may require more than one set of
generalized coordinates to cover the entire configuration space. For example, 
if the configuration space is a two-dimensional sphere, we could have one set 
of coordinates that maps (a little more than) the northern hemisphere to a 
disk, and another set that maps (a little more than) the southern hemisphere 
to a disk, with a strip near the equator common to both coordinate systems. 
A space that can be locally parametrized by smooth coordinate functions is 
called a differentiable manifold. The theory of differentiable manifolds can be 
used to formulate a coordinate-free treatment of variational mechanics. An 
introduction to mechanics from this perspective can be found in [2] or [5] . 

22 The derivative of a function $f$ is a function. It is denoted $Df$. Our notational
convention is that D is a high-precedence operator. Thus D operates on the 
adjacent function before any other application occurs: Df(x) is the same as 
(Df)(x). 
“chart”) that extends a coordinate representation to the local tuple: 23

$$\bar{\chi}(t, \gamma(t), \mathcal{D}\gamma(t), \ldots) = (t, q(t), Dq(t), \ldots), \tag{1.5}$$

where $q = \chi \circ \gamma$. The function $\bar{\chi}$ takes the coordinate-free local
tuple $(t, \gamma(t), \mathcal{D}\gamma(t), \ldots)$ and gives a coordinate representation as
a tuple of the time, the value of the coordinate path function at
that time, and the values of as many derivatives of the coordinate
path function as are needed.

Given a coordinate path $q = \chi \circ \gamma$ the rest of the local tuple can
be computed from it. We introduce a function $\Gamma$ that does this:

$$\Gamma[q](t) = (t, q(t), Dq(t), \ldots). \tag{1.6}$$

The evaluation of $\Gamma$ only involves taking derivatives of the coordinate
path $q = \chi \circ \gamma$; the function $\Gamma$ does not depend on $\chi$. From
relations (1.5) and (1.6) we find

$$\Gamma[q] = \bar{\chi} \circ \mathcal{T}[\gamma]. \tag{1.7}$$

Exercise 1.3: Generalized coordinates 

For each of the systems described in exercise 1.2 specify a system of 
generalized coordinates that can be used to describe the behavior of the 
system. 

Lagrangians in generalized coordinates 

The action is a property of a configuration path segment for a
particular Lagrangian $\mathcal{L}$. The action does not depend on the coordinate
system that is used to label the configurations. We can
use this property to find a coordinate representation $L_\chi$ for the
Lagrangian $\mathcal{L}$.



23 The formal definition of $\bar{\chi}$ is unimportant to the discussion, but if you really
want to know here is one way to do it:

First, we define the derivative $\mathcal{D}\gamma$ of a configuration path $\gamma$ in terms of
ordinary derivatives by specifying how it acts on sufficiently smooth real-valued
functions $f$ of configurations: $(\mathcal{D}^n\gamma)(t)(f) = D^n(f \circ \gamma)(t)$. Then we
define $\bar{\chi}(a, b, c, d, \ldots) = (a, \chi(b), c(\chi), d(\chi), \ldots)$. With this definition:

$$\begin{aligned}
\bar{\chi}(t, \gamma(t), \mathcal{D}\gamma(t), \mathcal{D}^2\gamma(t), \ldots)
  &= (t, \chi(\gamma(t)), \mathcal{D}\gamma(t)(\chi), \mathcal{D}^2\gamma(t)(\chi), \ldots) \\
  &= (t, \chi \circ \gamma(t), D(\chi \circ \gamma)(t), D^2(\chi \circ \gamma)(t), \ldots) \\
  &= (t, q(t), Dq(t), D^2 q(t), \ldots).
\end{aligned}$$
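The action is

$$\mathcal{S}[\gamma](t_1, t_2) = \int_{t_1}^{t_2} \mathcal{L} \circ \mathcal{T}[\gamma]. \tag{1.8}$$

The Lagrangian $\mathcal{L}$ is a function of the local tuple
$\mathcal{T}[\gamma](t) = (t, \gamma(t), \mathcal{D}\gamma(t), \ldots)$. The local tuple has the coordinate
representation $\Gamma[q] = \bar{\chi} \circ \mathcal{T}[\gamma]$, where $q = \chi \circ \gamma$. So if we choose 24

$$L_\chi = \mathcal{L} \circ \bar{\chi}^{-1}, \tag{1.9}$$

then 25

$$L_\chi \circ \Gamma[q] = \mathcal{L} \circ \mathcal{T}[\gamma]. \tag{1.10}$$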

On the left we have the composition of functions that use the 
intermediary of a coordinate representation; on the right we have 
the composition of two functions that do not involve coordinates. 
We define the coordinate representation of the action to be

$$S_\chi[q](t_1, t_2) = \int_{t_1}^{t_2} L_\chi \circ \Gamma[q]. \tag{1.11}$$

The function $S_\chi$ takes a coordinate path; the function $\mathcal{S}$ takes a
configuration path. Since the integrands are the same by equation (1.10)
the integrals have the same value:

$$\mathcal{S}[\gamma](t_1, t_2) = S_\chi[\chi \circ \gamma](t_1, t_2). \tag{1.12}$$

So we have a way of constructing coordinate representations of a 
Lagrangian that gives the same action for a path in any coordinate 
system. 

For Lagrangians that depend only on positions and velocities
the action can also be written

$$S_\chi[q](t_1, t_2) = \int_{t_1}^{t_2} L_\chi(t, q(t), Dq(t))\, dt. \tag{1.13}$$



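24 The coordinate function $\chi$ is locally invertible, and so is $\bar{\chi}$.

25 $\mathcal{L} \circ \mathcal{T}[\gamma] = \mathcal{L} \circ \bar{\chi}^{-1} \circ \bar{\chi} \circ \mathcal{T}[\gamma] = L_\chi \circ \Gamma[\chi \circ \gamma] = L_\chi \circ \Gamma[q]$.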
The coordinate system used in the definition of a Lagrangian or
an action is usually unambiguous, so the subscript $\chi$ will usually
be dropped.



1.4 Computing Actions 

To illustrate the above ideas, and to introduce their formulation as 
computer programs, we consider the simplest mechanical system — 
a free particle moving in three dimensions. Euler and Lagrange 
discovered that for a free particle the time-integral of the kinetic 
energy over the particle’s actual path is smaller than the same 
integral along any alternative path between the same points: a 
free particle moves according to the principle of stationary action, 
provided we take the Lagrangian to be the kinetic energy. The kinetic
energy for a particle of mass $m$ and velocity $\vec{v}$ is $\frac{1}{2} m v^2$, where
$v$ is the magnitude of $\vec{v}$. In this case we can choose the generalized
coordinates to be the ordinary rectangular coordinates.

Following Euler and Lagrange, the Lagrangian for the free particle
is 26

$$L(t, x, v) = \frac{1}{2} m (v \cdot v), \tag{1.14}$$

where the formal parameter x names a tuple of components of 
the position with respect to a given rectangular coordinate sys- 
tem, and where the formal parameter v names a tuple of velocity 
components. 27 

We can express this formula as a procedure: 



26 Here we are making a function definition. A definition specifies the value 
of the function for arbitrarily chosen formal parameters. One may change 
the name of a formal parameter, so long as the new name does not conflict 
with any other symbol in the definition. For example, the following definition 
specifies exactly the same free-particle Lagrangian: 

$$L(a, b, c) = \frac{1}{2} m (c \cdot c).$$



27 The Lagrangian is formally a function of the local tuple, but any particular 
Lagrangian only depends on a finite initial segment of the local tuple. We 
define functions of local tuples by explicitly declaring names for the elements 
of the initial segment of the local tuple that includes the elements upon which 
the function depends. 
(define ((L-free-particle mass) local)
  (let ((v (velocity local)))
    (* 1/2 mass (dot-product v v))))

The definition indicates that L-free-particle is a procedure that
takes mass as an argument and returns a procedure that takes 
a local tuple local, 28 extracts the generalized velocity with the 
procedure velocity, and uses the velocity to compute the value 
of the Lagrangian. 
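
To see the two-stage structure at work without loading Scmutils, here is a minimal sketch in plain Scheme. It is not the book's implementation: the local tuple is modelled as an ordinary list (time coordinates velocities), and the names tuple-velocity, dot, and L-free-particle-sketch are hypothetical stand-ins chosen so that they do not clash with the Scmutils procedures.

```scheme
;; Stand-in for the local tuple: a plain list (time coordinates velocities),
;; where coordinates and velocities are themselves lists of numbers.
(define (tuple-velocity local) (list-ref local 2))

;; Dot product of two equal-length lists of numbers.
(define (dot u v)
  (if (null? u)
      0
      (+ (* (car u) (car v)) (dot (cdr u) (cdr v)))))

;; Takes the mass and returns a procedure of the local tuple, mirroring the
;; shape of L-free-particle.
(define (L-free-particle-sketch mass)
  (lambda (local)
    (let ((v (tuple-velocity local)))
      (* 1/2 mass (dot v v)))))

;; Mass 3 with velocity (4 3 2); the time and the position are ignored.
(define example-value
  ((L-free-particle-sketch 3) (list 0 (list 7 5 1) (list 4 3 2))))
```

With exact arguments example-value is the exact rational 87/2, that is, one half of 3 times (16 + 9 + 4). The free-particle Lagrangian looks only at the velocity slot of the tuple.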

Suppose we let q denote a coordinate path function that maps 
time to position components: 29 

q(t) = (x(t),y(t),z(t)) . (1.15) 

We can make this definition 30 

(define q
  (up (literal-function 'x)
      (literal-function 'y)
      (literal-function 'z)))

where literal-function makes a procedure that represents a 
function of one argument that has no known properties other than 
the given symbolic name. 31 The symbol q now names a procedure 



28 We represent the local tuple as a composite data structure, the components 
of which are the time, the generalized coordinates, the generalized velocities, 
and possibly higher derivatives. We do not want to be bothered by the details 
of packing and unpacking the components into these structures, so we provide 
utilities for doing this. The constructor ->local takes the time, the coor- 
dinates, and the velocities and returns a data structure representing a local 
tuple. The selectors time, coordinate, and velocity extract the appropriate
pieces from the local structure. They are defined as time = (component 0),
coordinate = (component 1), and velocity = (component 2).

29 Be careful. The x in the definition of q is not the same as the x that was used 
as a formal parameter in the definition of the free-particle Lagrangian above. 
There are only so many letters in the alphabet, so we are forced to reuse them. 
We will be careful to indicate where symbols are given new meanings. 

30 A tuple of coordinate or velocity components is made with the procedure 
up. Component i of the tuple q is (ref q i). All indexing is zero based. The 
word up is to remind us that in mathematical notation these components are 
indexed by superscripts. There are also down tuples of components that are 
indexed by subscripts. See the appendix on notation. 

31 In our system, arithmetic operators are generic over symbols and expressions 
as well as numeric values; so arithmetic procedures can work uniformly with 
numbers or expressions. For example, if we have the procedure (define (cube 
of one real argument (time) that produces a tuple of three components
representing the coordinates at that time. For example,
we can evaluate this procedure for a symbolic time t as follows:

(print-expression (q 't))

(up (x t) (y t) (z t))

The procedure print-expression produces a printable form of 
the expression. The procedure print-expression simplifies ex- 
pressions before printing them. 

The derivative of the coordinate path Dq is the function that 
maps time to velocity components: 

Dq(t) = (Dx(t),Dy(t),Dz(t)). 

We can make and use the derivative of a function. 32 For example, 
we can write: 

(print-expression ((D q) 't))

(up ((D x) t) ((D y) t) ((D z) t))

The function $\Gamma$ takes a coordinate path and returns a function of
time that gives the local tuple $(t, q(t), Dq(t), \ldots)$. We implement
this $\Gamma$ with the procedure Gamma. Here is what Gamma does:

(print-expression ((Gamma q) 't))

(up t
    (up (x t) (y t) (z t))
    (up ((D x) t) ((D y) t) ((D z) t)))

So the composition $L \circ \Gamma[q]$ is a function of time that returns the
value of the Lagrangian for this point on the path:

(print-expression
 ((compose (L-free-particle 'm) (Gamma q)) 't))

(+ (* 1/2 m (expt ((D x) t) 2))
   (* 1/2 m (expt ((D y) t) 2))
   (* 1/2 m (expt ((D z) t) 2)))



x) (* x x x)) we can obtain its value for a number (cube 2) => 8 or for a
literal symbol (cube 'a) => (* a a a).

32 Derivatives of functions yield functions. For example, ((D cube) 2) => 12
and ((D cube) 'a) => (* 3 (expt a 2)).
The procedure show-expression is like print-expression except
that it puts the simplified expression into traditional infix form 
and displays the result. 33 Most of the time we will use this method 
of display, to make the boxed expressions that appear in this book. 
It also produces the prefix form as returned by print-expression, 
but we will usually not show this. 34 

(show-expression
 ((compose (L-free-particle 'm) (Gamma q)) 't))

$$\frac{1}{2} m (Dx(t))^2 + \frac{1}{2} m (Dy(t))^2 + \frac{1}{2} m (Dz(t))^2$$



According to equation (1.11) we can compute the Lagrangian
action from time $t_1$ to time $t_2$ as:

(define (Lagrangian-action L q t1 t2)
  (definite-integral (compose L (Gamma q)) t1 t2))

Lagrangian-action takes as arguments a procedure L that computes
the Lagrangian, a procedure q that computes a coordinate
path, and starting and ending times t1 and t2. The definite-integral
used here takes as arguments a function and two limits
t1 and t2, and computes the definite integral of the function
over the interval from t1 to t2. 35 Notice that the definition of
Lagrangian-action does not depend on any particular set of co- 
ordinates or even the dimension of the configuration space. The 
method of computing the action from the coordinate representa- 
tion of a Lagrangian and a coordinate path does not depend on 
the coordinate system. 

We can now compute the action for the free particle along a 
path. For example, consider a particle moving at uniform speed 



33 The display is generated with TeX.

34 For very complicated expressions the prefix notation of Scheme is often bet- 
ter, but simplification is almost always useful. We can separate the functions 
of simplification and infix display. We will see examples of this later. 

35 Scmutils includes a variety of numerical integration procedures. The examples
in this section were computed by rational-function extrapolation of
Euler-MacLaurin formulas with a relative error tolerance of $10^{-10}$.
along a straight line $t \mapsto (4t + 7, 3t + 5, 2t + 1)$. 36 We represent
the path as a procedure

(define (test-path t) 

(up (+ (* 4 t) 7) 

(+ (* 3 t) 5) 

(+ (* 2 t) 1))) 

For a particle of mass 3, we obtain the action between $t = 0$ and
$t = 10$ as 37

(Lagrangian-action (L-free-particle 3.0) test-path 0.0 10.0)

435.
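
This number is easy to confirm by hand: the velocity is the constant (4, 3, 2), so the Lagrangian is one half of 3 times (16 + 9 + 4), which is 43.5 at every instant, and integrating over 10 time units gives 435. The sketch below repeats the check numerically in plain Scheme. It does not use Scmutils: simpson is a hypothetical stand-in for definite-integral based on the composite Simpson rule, and the velocity of the test path is written out by hand rather than obtained with D.

```scheme
;; Composite Simpson's rule for a real function f on [a, b], using n
;; subintervals (n must be even).
(define (simpson f a b n)
  (let ((h (/ (- b a) n)))
    (let loop ((i 1) (sum (+ (f a) (f b))))
      (if (= i n)
          (* (/ h 3) sum)
          (loop (+ i 1)
                (+ sum (* (if (odd? i) 4 2)
                          (f (+ a (* i h))))))))))

;; Velocity components of the path t -> (4t+7, 3t+5, 2t+1).
(define (test-path-velocity t) (list 4. 3. 2.))

;; Free-particle Lagrangian along the path, as a function of time.
(define (kinetic-energy-along-path mass)
  (lambda (t)
    (let ((v (test-path-velocity t)))
      (* 0.5 mass (apply + (map * v v))))))

(define action-check
  (simpson (kinetic-energy-along-path 3.0) 0.0 10.0 100))
```

Because the integrand is constant, any reasonable quadrature rule reproduces the value 435 up to rounding; the choice of Simpson's rule here is a convenience, not the method used by Scmutils.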

Exercise 1.4: Lagrangian actions 

For a free particle an appropriate Lagrangian is 38

$$L(t, x, v) = \frac{1}{2} m v^2.$$

Suppose that $x$ is the constant-velocity straight-line path of a free particle,
such that $x_a = x(t_a)$ and $x_b = x(t_b)$. Show that the action on the
solution path is

$$\frac{m}{2} \frac{(x_b - x_a)^2}{t_b - t_a}.$$

Paths of minimum action 

We already know that the actual path of a free particle is uniform 
motion in a straight line. According to Euler and Lagrange the 
action is smaller along a straight-line test path than along nearby 
paths. Let $q$ be a straight-line test path with action $S[q](t_1, t_2)$.
Let $q + \epsilon\eta$ be a nearby path, obtained from $q$ by adding a path



36 Surely for a real physical situation we would have to specify units for these 
quantities. In this illustration we do not give units. 

37 Here we use decimal numerals to specify the parameters. This forces the 
representations to be floating point, which is efficient for numerical calculation. 
If symbolic algebra is to be done it is essential that the numbers be exact 
integers or rational fractions, so that expressions can be reliably reduced to 
lowest terms. Such numbers are specified without a decimal point. 

38 The squared magnitude of the velocity is $v \cdot v$, the vector dot-product of
the velocity with itself. The square of a structure of components is defined to
be the sum of the squares of the individual components, so we write simply
$v^2 = v \cdot v$.
variation $\eta$ scaled by the real parameter $\epsilon$. 39 The action on the
varied path is $S[q + \epsilon\eta](t_1, t_2)$. Euler and Lagrange found that
$S[q + \epsilon\eta](t_1, t_2) > S[q](t_1, t_2)$ for any $\eta$ that is zero at the endpoints and
for any small non-zero $\epsilon$.

Let’s check this numerically by varying the test path, adding
some amount of a test function that is zero at the endpoints $t = t_1$
and $t = t_2$. To make a function $\eta$ that is zero at the endpoints,
given a sufficiently well-behaved function $\nu$, we can use
$\eta(t) = (t - t_1)(t - t_2)\nu(t)$. This can be implemented:

(define ((make-eta nu t1 t2) t)
  (* (- t t1) (- t t2) (nu t)))

We can use this to compute the action for a free particle over a
path varied from the given path, as a function of $\epsilon$: 40

(define ((varied-free-particle-action mass q nu t1 t2) epsilon)
  (let ((eta (make-eta nu t1 t2)))
    (Lagrangian-action (L-free-particle mass)
                       (+ q (* epsilon eta))
                       t1
                       t2)))

The action for the varied path, with $\nu(t) = (\sin t, \cos t, t^2)$ and
$\epsilon = 0.001$, is, as expected, larger than for the test path:

((varied-free-particle-action 3.0 test-path
                              (up sin cos square)
                              0.0 10.0)
 0.001)

436.29121428571153
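
This value can be checked by hand. The sketch below assumes only what has already been given: the straight test path $q$ with constant velocity $Dq = (4, 3, 2)$, mass $m = 3$, and $\eta(t) = t\,(t - 10)\,\nu(t)$ with $\nu(t) = (\sin t, \cos t, t^2)$. The term linear in $\epsilon$ integrates to zero because $Dq$ is constant and $\eta$ vanishes at both endpoints, so only the quadratic term survives:

```latex
\begin{aligned}
S[q + \epsilon\eta](0, 10)
  &= \int_0^{10} \tfrac{1}{2}\, m\, \lvert Dq + \epsilon\, D\eta \rvert^2 \, dt \\
  &= S[q](0, 10) + \epsilon\, m\, Dq \cdot \bigl(\eta(10) - \eta(0)\bigr)
     + \tfrac{1}{2}\, m\, \epsilon^2 \int_0^{10} \lvert D\eta \rvert^2 \, dt \\
  &= 435 + 0 + \tfrac{3}{2}\, \epsilon^2
     \left( \int_0^{10} (2t - 10)^2 \, dt + \int_0^{10} (t^2 - 10t)^2 \, dt
            + \int_0^{10} (4t^3 - 30t^2)^2 \, dt \right) \\
  &= 435 + \tfrac{3}{2}\, \epsilon^2
     \left( \frac{1000}{3} + \frac{10000}{3} + \frac{6000000}{7} \right)
   = 435 + \tfrac{3}{2}\, (10^{-3})^2 \cdot \frac{18077000}{21}
   \approx 436.2912142857 .
\end{aligned}
```

The first two integrals come from the sine and cosine components taken together, for which $\lvert D\eta \rvert^2$ reduces to $(2t - 10)^2 + (t^2 - 10t)^2$. The coefficient of $\epsilon^2$ is positive, so for this particular $\eta$ the action as a function of $\epsilon$ is a parabola with its minimum at $\epsilon = 0$.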



39 Note that we are doing arithmetic on functions. We extend the arithmetic 
operations so that the combination of two functions of the same type (same 
domains and ranges) is the function on the same domain that combines the 
values of the argument functions in the range. For example, if $f$ and $g$ are
functions of $t$, then $fg$ is the function $t \mapsto f(t)g(t)$. A constant multiple of
a function is the function whose value is the constant times the value of the
function for each argument: $cf$ is the function $t \mapsto c f(t)$.

40 Note that we are adding procedures. Paralleling our extension of arithmetic 
operations to functions, arithmetic operations are extended to compatible pro- 
cedures. 
We can numerically compute the value of $\epsilon$ for which the action
is minimized. We search between, say, -2 and 1: 41

(minimize
 (varied-free-particle-action 3.0 test-path
                              (up sin cos square)
                              0.0 10.0)
 -2.0 1.0)

(-1.5987211554602254e-14 435.0000000000237 5)

We find exactly what is expected: that the best value for $\epsilon$ is
zero, 42 and the minimum value of the action is the action along
the straight path.
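
For readers who want to see a one-dimensional search spelled out, here is a minimal sketch in plain Scheme. It is not Brent's algorithm and not the Scmutils minimize procedure: it is a golden-section search, a simpler bracketing method, and the name golden-section-minimize is hypothetical. To keep the block self-contained, the action is replaced by an assumed model, the parabola 435 + c ε² with c = 27115500/21 (about 1.2912 × 10⁶), which is what integrating one half of m times the squared magnitude of Dη gives for the test path, mass, and ν used above.

```scheme
;; Golden-section search for the minimum of a unimodal function f on [a, b].
;; Returns the midpoint of the final bracketing interval.
(define (golden-section-minimize f a b tolerance)
  (let ((ratio (/ (- (sqrt 5.) 1.) 2.)))   ; about 0.618
    (let loop ((lo a) (hi b))
      (if (< (- hi lo) tolerance)
          (/ (+ lo hi) 2.)
          (let ((c (- hi (* ratio (- hi lo))))
                (d (+ lo (* ratio (- hi lo)))))
            ;; Keep the subinterval that must contain the minimum.
            (if (< (f c) (f d))
                (loop lo d)
                (loop c hi)))))))

;; Assumed model of the varied action as a function of epsilon.
(define (varied-action-model epsilon)
  (+ 435. (* (/ 27115500. 21.) epsilon epsilon)))

(define best-epsilon
  (golden-section-minimize varied-action-model -2. 1. 1e-8))
```

The search starts from the same interval as in the text and narrows it around zero. This version evaluates the function twice per step for clarity; a production minimizer would reuse one of the two evaluations and would typically converge in far fewer steps, as the Brent result quoted above does.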

Finding trajectories that minimize the action 

We have used the variational principle to determine if a given 
trajectory is realizable. We can also use the variational princi- 
ple to actually find trajectories. Given a set of trajectories that 
are specified by a finite number of parameters, we can search the 
parameter space looking for the trajectory in the set that best ap- 
proximates the real trajectory by finding one that minimizes the 
action. By choosing a good set of approximating functions we can 
get arbitrarily close to the real trajectory. 43 

One way to make a parametric path that has fixed endpoints 
is to use a polynomial that goes through the endpoints as well 
as a number of intermediate points. Variation of the positions 
of the intermediate points varies the path; the parameters of the 
varied path are the coordinates of the intermediate positions. The 
procedure make-path constructs such a path using a Lagrange 



41 The arguments to minimize are a procedure implementing the univariate 
function in question, and the lower and upper bounds of the region to be 
searched. Scmutils includes a choice of methods for numerical minimization; 
the one used here is Brent’s algorithm, with an error tolerance of $10^{-5}$. The
value returned by minimize is a list of 3 numbers: the first is the argument 
at which the minimum occurred, the second is the minimum obtained, and 
the third is the number of iterations of the minimization algorithm required 
to obtain the minimum. 

42 Yes, -1.5987211554602254e-14 is zero for the tolerance required of the min- 
imizer. And the 435.0000000000237 is arguably the same as 435 obtained 
before. 

43 There are lots of good ways to make such a parametric set of approximating 
trajectories. One could use splines or higher-order interpolating polynomials; 
one could use Chebyshev polynomials; one could use Fourier components. The 
choice depends upon the kinds of trajectories one wants to approximate. 
interpolation polynomial. 44 The procedure make-path is called
with five arguments: (make-path t0 q0 t1 q1 qs), where q0 and
q1 are the endpoints, t0 and t1 are the corresponding times, and
qs is a list of intermediate points.
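
The interpolation step can be illustrated with a small plain-Scheme sketch. It handles scalar values only and is not the Scmutils Lagrange-interpolation-function; the names lagrange-basis and lagrange-interpolant are hypothetical, and the argument order (values first, then times) is chosen to match the description of the library procedure in the footnote.

```scheme
;; Basis polynomial for node ti evaluated at t: the product over all other
;; nodes tj of (t - tj) / (ti - tj).  Assumes the nodes are distinct.
(define (lagrange-basis ti ts t)
  (let loop ((nodes ts) (product 1))
    (cond ((null? nodes) product)
          ((= (car nodes) ti) (loop (cdr nodes) product))
          (else (loop (cdr nodes)
                      (* product
                         (/ (- t (car nodes))
                            (- ti (car nodes)))))))))

;; Given a list of values ys and a list of times ts of the same length,
;; return a procedure of t that evaluates the polynomial of degree
;; (length ts) - 1 passing through the points (t_i, y_i).
(define (lagrange-interpolant ys ts)
  (lambda (t)
    (let loop ((ys-left ys) (ts-left ts) (sum 0))
      (if (null? ys-left)
          sum
          (loop (cdr ys-left)
                (cdr ts-left)
                (+ sum (* (car ys-left)
                          (lagrange-basis (car ts-left) ts t))))))))
```

For example, the interpolant through the values 1, 3, 7 at times 0, 1, 2 is the quadratic t² + t + 1, so evaluating it at t = 3 gives 13. Moving one of the intermediate values changes the whole polynomial while the endpoints stay fixed, which is what makes the intermediate positions usable as path parameters.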

Having specified a parametric path we can construct a paramet- 
ric action that is just the action computed along the parametric 
path: 

(define ((parametric-path-action Lagrangian t0 q0 t1 q1) qs)
  (let ((path (make-path t0 q0 t1 q1 qs)))
    (Lagrangian-action Lagrangian path t0 t1)))

We can find approximate solution paths by finding parameters 
that minimize the action. We do this minimization with a canned 
multidimensional minimization procedure: 45

(define (find-path Lagrangian t0 q0 t1 q1 n)
  (let ((initial-qs (linear-interpolants q0 q1 n)))
    (let ((minimizing-qs
           (multidimensional-minimize
            (parametric-path-action Lagrangian t0 q0 t1 q1)
            initial-qs)))
      (make-path t0 q0 t1 q1 minimizing-qs))))



44 Here is one way to implement make-path: 

(define (make-path t0 q0 t1 q1 qs)
  (let ((n (length qs)))
    (let ((ts (linear-interpolants t0 t1 n)))
      (Lagrange-interpolation-function
       (append (list q0) qs (list q1))
       (append (list t0) ts (list t1))))))

The procedure linear-interpolants produces a list of elements that linearly 
interpolate the first two arguments. We use this procedure here to specify ts, 
the n evenly spaced intermediate times between t0 and t1 at which the path
will be specified. The parameters being adjusted, qs, are the positions at these 
intermediate times. The procedure Lagrange-interpolation-function takes 
a list of values and a list of times and produces a procedure that computes 
the Lagrange interpolation polynomial that goes through these points. 

45 The minimizer used here is the Nelder-Mead downhill simplex method. As 
usual with numerical procedures, the interface to the nelder-mead procedure 
is complex, with lots of optional parameters to allow the user to control errors 
effectively. For this presentation we have specialized nelder-mead by wrapping 
it in the more palatable multidimensional-minimize. Unfortunately, you will 
have to learn to live with complicated numerical procedures someday. 
The procedure multidimensional-minimize takes a procedure (in
this case the value of the call to parametric-path-action) that
computes the function to be minimized (in this case the action)
and an initial guess for the parameters. Here we choose the initial
guess to be equally-spaced points on a straight line between the
two endpoints, computed with linear-interpolants.

To illustrate the use of this strategy, we will find trajectories of 
the harmonic oscillator, with Lagrangian 46 

$$L(t, q, v) = \frac{1}{2} m v^2 - \frac{1}{2} k q^2, \tag{1.16}$$

for mass m and spring constant k. This Lagrangian is imple- 
mented by 

(define ((L-harmonic m k) local)
  (let ((q (coordinate local))
        (v (velocity local)))
    (- (* 1/2 m (square v)) (* 1/2 k (square q)))))

We can find an approximate path taken by the harmonic oscillator
for $m = 1$ and $k = 1$ between $q(0) = 1$ and $q(\pi/2) = 0$ as
follows: 47

(define q (find-path (L-harmonic 1.0 1.0) 0. 1. :pi/2 0. 3))

We know that the trajectories of this harmonic oscillator, for
$m = 1$ and $k = 1$, are

$$q(t) = A \cos(t + \varphi), \tag{1.17}$$

where the amplitude $A$ and the phase $\varphi$ are determined by the
initial conditions. For the chosen endpoint conditions the solution
is $q(t) = \cos(t)$. The approximate path should be an approximation
to cosine over the range from 0 to $\pi/2$. Figure 1.1 shows the
error in the polynomial approximation produced by this process.
The maximum error in the approximation with three intermediate
points is less than $1.7 \times 10^{-4}$. We find, as expected, that the
error in the approximation decreases as the number of intermedi- 



46 Don’t worry. We know that you don’t yet know why this is the right La- 
grangian. We will get to this in section 1.6. 

47 By convention, named constants have names that begin with colon. The 
constants named :pi and :-pi are what we would expect from their names. 
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Figure 1.1 The difference between the polynomial approximation 
with minimum action and the actual trajectory taken by the harmonic 
oscillator. The abscissa is the time and the ordinate is the error. 

ate points is increased. For four intermediate points it is about a 
factor of 15 better. 
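The strategy can also be tried outside Scmutils. The following is a minimal Python sketch, not the book's find-path: it assumes a piecewise-linear path in place of the Lagrange interpolation polynomial and plain gradient descent in place of Nelder-Mead, and the names discrete_action, gradient, and find_path_sketch are made up for this illustration.

```python
import math

def discrete_action(interior, t0, q0, t1, q1, m=1.0, k=1.0):
    """Action of L = m v^2/2 - k q^2/2 on a piecewise-linear path.

    Each segment contributes its constant velocity and midpoint position.
    """
    pts = [q0, *interior, q1]
    h = (t1 - t0) / (len(pts) - 1)
    total = 0.0
    for a, b in zip(pts, pts[1:]):
        v = (b - a) / h
        q_mid = 0.5 * (a + b)
        total += (0.5 * m * v * v - 0.5 * k * q_mid * q_mid) * h
    return total

def gradient(f, x, eps=1e-6):
    """Central-difference gradient of f at x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + eps] + x[i + 1:]
        dn = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((f(up) - f(dn)) / (2 * eps))
    return grad

def find_path_sketch(t0, q0, t1, q1, n, step=0.04, iters=3000):
    """Minimize the discrete action over n interior positions.

    The initial guess is a straight line between the endpoints.
    The step size is an assumption tuned for this small example.
    """
    f = lambda qs: discrete_action(qs, t0, q0, t1, q1)
    qs = [q0 + (q1 - q0) * (i + 1) / (n + 1) for i in range(n)]
    for _ in range(iters):
        g = gradient(f, qs)
        qs = [q - step * gi for q, gi in zip(qs, g)]
    return qs

n = 7
qs = find_path_sketch(0.0, 1.0, math.pi / 2, 0.0, n)
times = [(math.pi / 2) * (i + 1) / (n + 1) for i in range(n)]
max_error = max(abs(q - math.cos(t)) for q, t in zip(qs, times))
```

Because the path is only piecewise linear, the interior points approach cos(t) less closely than the polynomial paths described above; the sketch is meant to show the structure of the computation, not to reproduce the quoted error figures.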

Exercise 1.5: Solution process 

We can watch the progress of the minimization by modifying the proce- 
dure parametric-path-action to plot the path each time the action is 
computed. Try this: 

(define win2 (frame 0. :pi/2 0. 1.2))

(define ((parametric-path-action Lagrangian t0 q0 t1 q1)
         intermediate-qs)
  (let ((path (make-path t0 q0 t1 q1 intermediate-qs)))
    ;; display path
    (graphics-clear win2)
    (plot-function win2 path t0 t1 (/ (- t1 t0) 100))
    ;; compute action
    (Lagrangian-action Lagrangian path t0 t1)))

(find-path (L-harmonic 1. 1.) 0. 1. :pi/2 0. 2)



Exercise 1.6: Minimizing action 

Suppose we try to obtain a path by minimizing an action for an im- 
possible problem. For example, suppose we have a free particle and we 
impose endpoint conditions on the velocities as well as the positions that
are inconsistent with the particle being free. Does the formalism protect 
itself from such an unpleasant attack? You may find it illuminating to 
program it and see what happens. 



1.5 The Euler-Lagrange Equations

The principle of stationary action characterizes the realizable 
paths of systems in configuration space as those for which the 
action has a stationary value. In elementary calculus, we learn 
that the critical points of a function are the points where the 
derivative vanishes. In an analogous way, the paths along which 
the action is stationary are solutions of a system of differential 
equations. This system, called the Euler-Lagrange equations or
just the Lagrange equations, is the link that permits us to use 
the principle of stationary action to compute the motions of me- 
chanical systems, and to relate the variational and Newtonian 
formulations of mechanics. 48 

Lagrange equations 

We will find that if L is a Lagrangian for a system that depends 
on time, coordinates, and velocities, and if q is a coordinate path 
for which the action $S[q](t_1, t_2)$ is stationary (with respect to any
variation in the path that keeps the endpoints of the path fixed)
then

$$D(\partial_2 L \circ \Gamma[q]) - \partial_1 L \circ \Gamma[q] = 0. \tag{1.18}$$

Here $L$ is a real-valued function of a local tuple; $\partial_1 L$ and $\partial_2 L$
denote the partial derivatives of $L$ with respect to its generalized
position and generalized velocity arguments. 49 The function
$\partial_2 L$ maps a local tuple to a structure whose components are the
derivatives of $L$ with respect to each component of the generalized
velocity. The function $\Gamma[q]$ maps time to the local tuple:
$\Gamma[q](t) = (t, q(t), Dq(t), \ldots)$. Thus the compositions $\partial_1 L \circ \Gamma[q]$ and



48 This result was initially discovered by Euler and later rederived by Lagrange. 

49 The derivative or partial derivative of a function that takes structured argu- 
ments is a new function that takes the same number and type of arguments. 
The range of this new function is itself a structure with the same number of 
components as the argument with respect to which the function is differenti- 
ated. 
$\partial_2 L \circ \Gamma[q]$ are functions of one argument, time. The Lagrange
equations assert that the derivative of $\partial_2 L \circ \Gamma[q]$ is equal to
$\partial_1 L \circ \Gamma[q]$, at any time. Given a Lagrangian, the Lagrange equations
form a system of ordinary differential equations that must be satisfied by
realizable paths. 50

1.5.1 Derivation of the Lagrange Equations 

We will show that the principle of stationary action implies that
realizable paths satisfy a set of ordinary differential equations. 
First we will develop tools for investigating how path-dependent 
functions vary as the paths are varied. We will then apply these 
tools to the action, to derive the Lagrange equations. 

Varying a path 

Suppose that we have a function $f[q]$ that depends on a path $q$.
How does the function vary as the path is varied? Let $q$ be a
coordinate path and $q + \epsilon\eta$ be a varied path, where the function
$\eta$ is a path-like function that can be added to the path $q$, and the
factor $\epsilon$ is a scale factor. We define the variation $\delta_\eta f[q]$ of the
function $f$ on the path $q$ by 51

$$\delta_\eta f[q] = \lim_{\epsilon \to 0} \left( \frac{f[q + \epsilon\eta] - f[q]}{\epsilon} \right). \tag{1.19}$$



50 Lagrange’s equations are traditionally written in the form

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0,$$

or, if we write a separate equation for each component of $q$, as

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}^i} - \frac{\partial L}{\partial q^i} = 0, \qquad i = 0, \ldots, n - 1.$$

In this way of writing Lagrange’s equations the notation does not distinguish
between $L$, which is a real-valued function of three variables $(t, q, \dot{q})$, and
$L \circ \Gamma[q]$, which is a real-valued function of one real variable $t$. If we do
not realize this notational pun, the equations don’t make sense as written:
$\partial L/\partial \dot{q}$ is a function of three variables, so we must regard the arguments
$q$, $\dot{q}$ as functions of $t$ before taking $d/dt$ of the expression. Similarly,
$\partial L/\partial q$ is a function of three variables, which we must view as a function
of $t$ before setting it equal to $d/dt(\partial L/\partial \dot{q})$. These implicit applications
of the chain rule pose no problem in performing hand computations, once you
understand what the equations represent.

51 The variation operator $\delta_\eta$ is like the derivative operator in that it acts on
the immediately following function: $\delta_\eta f[q] = (\delta_\eta f)[q]$.
The variation of $f$ is a linear approximation to the change in the
function $f$ for small variations in the path. The variation of $f$
depends on $\eta$.

A simple example is the variation of the identity path function:
$I[q] = q$. Applying the definition

$$\delta_\eta I[q] = \lim_{\epsilon \to 0} \left( \frac{(q + \epsilon\eta) - q}{\epsilon} \right) = \eta. \tag{1.20}$$

It is traditional to write $\delta_\eta I[q]$ simply as $\delta q$. Another example is
the variation of the path function that returns the derivative of
the path. We have

$$\delta_\eta g[q] = \lim_{\epsilon \to 0} \left( \frac{D(q + \epsilon\eta) - Dq}{\epsilon} \right) = D\eta \quad \text{with } g[q] = Dq. \tag{1.21}$$

It is traditional to write $\delta_\eta g[q]$ as $\delta Dq$.

The variation may be represented in terms of a derivative. Let
$g(\epsilon) = f[q + \epsilon\eta]$, then

$$\delta_\eta f[q] = \lim_{\epsilon \to 0} \left( \frac{g(\epsilon) - g(0)}{\epsilon} \right) = Dg(0). \tag{1.22}$$

Variations have the following derivative-like properties. For
path-dependent functions $f$ and $g$ and constant $c$:

$$\delta_\eta (f g)[q] = \delta_\eta f[q] \, g[q] + f[q] \, \delta_\eta g[q] \tag{1.23}$$

$$\delta_\eta (f + g)[q] = \delta_\eta f[q] + \delta_\eta g[q] \tag{1.24}$$

$$\delta_\eta (c f)[q] = c \, \delta_\eta f[q]. \tag{1.25}$$

Let $F$ be a path-independent function and let $g$ be a path-dependent
function, then

$$\delta_\eta h[q] = (DF \circ g[q]) \, \delta_\eta g[q] \quad \text{with } h[q] = F \circ g[q]. \tag{1.26}$$

The operators $D$ (differentiation) and $\delta$ (variation) commute in
the following sense:

$$D \delta_\eta f[q] = \delta_\eta g[q] \quad \text{with } g[q] = D(f[q]). \tag{1.27}$$

Variations also commute with integration in a similar sense.

If a path-dependent function $f$ is stationary for a particular
path $q$ with respect to small changes in that path then it must be
stationary for a subset of those variations that result from adding
small multiples of a particular function $\eta$ to $q$. So the statement
$\delta_\eta f[q] = 0$ for arbitrary $\eta$ implies the function $f$ is stationary for
small variations of the path around $q$.
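The representation of the variation as a derivative, equation (1.22), suggests a direct numerical check. The sketch below is in Python rather than Scmutils; the helper names variation and square_path are assumptions made for this illustration, and a central difference stands in for the limit. It estimates the variation of the path function $f[q](t) = (q(t))^2$ and compares it with $2 q(t) \eta(t)$, the value obtained from the product rule (1.23) together with (1.20).

```python
import math

def variation(f, q, eta, t, eps=1e-6):
    """Estimate (delta_eta f)[q](t) as D g(0), where g(e) = f[q + e*eta](t)."""
    def g(e):
        varied = lambda s: q(s) + e * eta(s)
        return f(varied)(t)
    return (g(eps) - g(-eps)) / (2 * eps)

def square_path(q):
    """Path-dependent function f[q](t) = q(t)^2."""
    return lambda t: q(t) ** 2

q = math.cos                       # the path
eta = lambda t: math.sin(2 * t)    # the direction of variation
t = 0.3

estimate = variation(square_path, q, eta, t)
exact = 2 * q(t) * eta(t)          # from (1.23) with f = g = I
```

The two numbers agree to within rounding error because $g(\epsilon)$ is a quadratic polynomial in $\epsilon$ for this choice of $f$, so the central difference introduces no truncation error.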

Exercise 1.7: Properties of $\delta$

Show that $\delta$ has the properties 1.23-1.27.

Exercise 1.8: Implementation of $\delta$

a. Suppose we have a procedure f that implements a path-dependent
function: for path q and time t it has the value ((f q) t). The procedure
delta computes the variation $(\delta_\eta f)[q](t)$ as the value of ((((delta
eta) f) q) t). Complete the definition of delta:

(define ((((delta eta) f) q) t)
  ... )

b. Use your delta procedure to verify the properties of $\delta$ listed in
exercise 1.7 for simple functions such as implemented by the procedure f:

(define ((F q) t)
  ((literal-function 'f) (q t)))

This implements a simple path-dependent function that depends only 
on the coordinates of the path at each moment. 

Varying the action 

The action is the integral of the Lagrangian along a path:

$$S[q](t_1, t_2) = \int_{t_1}^{t_2} L \circ \Gamma[q]. \tag{1.28}$$

For a realizable path $q$ the variation of the action with respect to
any variation $\eta$ that preserves the endpoints, $\eta(t_1) = \eta(t_2) = 0$, is
zero:

$$\delta_\eta S[q](t_1, t_2) = 0. \tag{1.29}$$

The variation of the action is

$$\delta_\eta S[q](t_1, t_2) = \int_{t_1}^{t_2} \delta_\eta h[q] \quad \text{where } h[q] = L \circ \Gamma[q]. \tag{1.30}$$

This follows from the fact that variation commutes with integration.
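Statement (1.29) can be probed numerically for the harmonic oscillator of equation (1.16). The following Python sketch is an illustration only: the helpers simpson, action, and action_variation are hypothetical names, the integral is approximated by Simpson's rule, and derivatives of the path and of the variation are supplied by hand. It estimates the variation of the action for the realizable path $\cos t$ and for a straight line with the same endpoints, using a variation that vanishes at $t = 0$ and $t = \pi/2$.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def action(q, dq, t1, t2, m=1.0, k=1.0):
    """S[q](t1, t2) for the harmonic-oscillator Lagrangian (1.16)."""
    lagrangian = lambda t: 0.5 * m * dq(t) ** 2 - 0.5 * k * q(t) ** 2
    return simpson(lagrangian, t1, t2)

def action_variation(q, dq, eta, deta, t1, t2, eps=1e-4):
    """Estimate delta_eta S[q] as D g(0) with g(e) = S[q + e*eta]."""
    def g(e):
        return action(lambda t: q(t) + e * eta(t),
                      lambda t: dq(t) + e * deta(t), t1, t2)
    return (g(eps) - g(-eps)) / (2 * eps)

t1, t2 = 0.0, math.pi / 2
eta = lambda t: math.sin(2 * t)        # vanishes at both endpoints
deta = lambda t: 2 * math.cos(2 * t)

# Realizable path q(t) = cos t: the variation should vanish.
ds_true = action_variation(math.cos, lambda t: -math.sin(t),
                           eta, deta, t1, t2)

# Straight line with the same endpoints: not a realizable path.
line = lambda t: 1 - 2 * t / math.pi
dline = lambda t: -2 / math.pi
ds_line = action_variation(line, dline, eta, deta, t1, t2)
```

For the cosine path the estimate is zero to within the accuracy of the quadrature, while for the straight line a hand calculation with this particular $\eta$ gives $-1/2$, which the sketch reproduces.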







Using the fact that

$$\delta_\eta \Gamma[q] = (0, \eta, D\eta), \tag{1.31}$$

which follows from equations (1.20) and (1.21), and using the chain
rule for variations (1.26) we get 52

$$\begin{aligned}
\delta_\eta S[q](t_1, t_2) &= \int_{t_1}^{t_2} (DL \circ \Gamma[q]) \, \delta_\eta \Gamma[q] \\
&= \int_{t_1}^{t_2} \{ (\partial_1 L \circ \Gamma[q]) \, \eta + (\partial_2 L \circ \Gamma[q]) \, D\eta \}.
\end{aligned} \tag{1.32}$$

Integrating the last term of equation (1.32) by parts gives

$$\begin{aligned}
\delta_\eta S[q](t_1, t_2) &= (\partial_2 L \circ \Gamma[q]) \, \eta \, \Big|_{t_1}^{t_2} \\
&\quad + \int_{t_1}^{t_2} \{ (\partial_1 L \circ \Gamma[q]) - D(\partial_2 L \circ \Gamma[q]) \} \, \eta.
\end{aligned} \tag{1.33}$$

For our variation $\eta$ we have $\eta(t_1) = \eta(t_2) = 0$ so the first term
vanishes.

So the variation of the action is zero if and only if

$$0 = \int_{t_1}^{t_2} \{ (\partial_1 L \circ \Gamma[q]) - D(\partial_2 L \circ \Gamma[q]) \} \, \eta. \tag{1.34}$$

The variation of the action is zero because, by assumption, $q$ is a
realizable path. Thus (1.34) must be true for any function $\eta$ that
is zero at the endpoints.

We retain enough freedom in the choice of the variation so that
this forces the factor in the integrand multiplying $\eta$ to be zero at
each point along the path. We argue by contradiction: Suppose
this factor were nonzero at some particular time. Then it would
have to be nonzero in at least one of its components. But if we
choose our $\eta$ to be a bump that is nonzero only in that component
in a neighborhood of that time, and zero everywhere else, then the



52 A function of multiple arguments is considered a function of a tuple of its 
arguments. Thus, the derivative of a function of multiple arguments is a 
tuple of the partial derivatives of that function with respect to each of the 
arguments. So in the case of a Lagrangian L 

$$DL(t, q, v) = [\partial_0 L(t, q, v), \partial_1 L(t, q, v), \partial_2 L(t, q, v)].$$
integral will be nonzero. So we may conclude that the factor in
curly brackets is identically zero: 53

$$D(\partial_2 L \circ \Gamma[q]) - (\partial_1 L \circ \Gamma[q]) = 0. \tag{1.35}$$

This is just what we set out to obtain, the Lagrange equations. 

A path satisfying Lagrange’s equations is one for which the 
action is stationary, and the fact that the action is stationary de- 
pends only on the values of L at each point of the path (and at 
each point on nearby paths), but not on the coordinate system we 
use to compute these values. So if the system’s path satisfies La- 
grange’s equations in some particular coordinate system, it must 
satisfy Lagrange’s equations in any coordinate system. Thus the 
equations of variational mechanics are derived the same way in 
any configuration space and any coordinate system. 

Harmonic oscillator 

For an example, consider the harmonic oscillator. A Lagrangian 
is 

$$L(t, x, v) = \tfrac{1}{2} m v^2 - \tfrac{1}{2} k x^2. \tag{1.36}$$

Then

$$\partial_1 L(t, x, v) = -k x \quad \text{and} \quad \partial_2 L(t, x, v) = m v. \tag{1.37}$$

The Lagrangian is applied to a tuple of the time, a coordinate,
and a velocity. The symbols $t$, $x$, and $v$ are arbitrary; they are
used to specify formal parameters of the Lagrangian.

Now suppose we have a configuration path $y$, which gives the
coordinate of the oscillator $y(t)$ for each time $t$. The initial segment
of the corresponding local tuple at time $t$ is

$$\Gamma[y](t) = (t, y(t), Dy(t)). \tag{1.38}$$

So

$$\partial_1 L \circ \Gamma[y](t) = -k y(t) \quad \text{and} \quad \partial_2 L \circ \Gamma[y](t) = m Dy(t), \tag{1.39}$$



53 To make this argument more precise requires careful analysis. 
and

$$D(\partial_2 L \circ \Gamma[y])(t) = m D^2 y(t), \tag{1.40}$$

so the Lagrange equation is

$$m D^2 y(t) + k y(t) = 0, \tag{1.41}$$

which is the equation of motion of the harmonic oscillator. 

Orbital motion 

As another example, consider the two-dimensional motion of a
particle of mass $m$ with gravitational potential energy $-\mu/r$,
where $r$ is the distance to the center of attraction. A Lagrangian
is 54

$$L(t; \xi, \eta; v_\xi, v_\eta) = \tfrac{1}{2} m (v_\xi^2 + v_\eta^2) + \frac{\mu}{\sqrt{\xi^2 + \eta^2}}, \tag{1.42}$$

where $\xi$ and $\eta$ are formal parameters for rectangular coordinates
of the particle, and $v_\xi$ and $v_\eta$ are formal parameters for corresponding
rectangular velocity components. Then 55

$$\begin{aligned}
\partial_1 L(t; \xi, \eta; v_\xi, v_\eta)
&= [\partial_{1,0} L(t; \xi, \eta; v_\xi, v_\eta), \partial_{1,1} L(t; \xi, \eta; v_\xi, v_\eta)] \\
&= \left[ \frac{-\mu \xi}{(\xi^2 + \eta^2)^{3/2}}, \frac{-\mu \eta}{(\xi^2 + \eta^2)^{3/2}} \right].
\end{aligned} \tag{1.43}$$

Similarly,

$$\partial_2 L(t; \xi, \eta; v_\xi, v_\eta) = [m v_\xi, m v_\eta]. \tag{1.44}$$

Now suppose we have a configuration path $q = (x, y)$, so that
the coordinate tuple at time $t$ is $q(t) = (x(t), y(t))$. The initial
segment of the local tuple at time $t$ is

$$\Gamma[q](t) = (t; x(t), y(t); Dx(t), Dy(t)). \tag{1.45}$$



54 When we write a definition that names the components of the local tuple, we 
indicate that these are grouped into time, position, and velocity components 
by separating the groups with semicolons. 

55 The derivative with respect to a tuple is a tuple of the partial derivatives 
with respect to each component of the tuple (see the appendix on notation). 
So

$$\begin{aligned}
\partial_1 L \circ \Gamma[q](t) &= \left[ \frac{-\mu x(t)}{((x(t))^2 + (y(t))^2)^{3/2}}, \frac{-\mu y(t)}{((x(t))^2 + (y(t))^2)^{3/2}} \right] \\
\partial_2 L \circ \Gamma[q](t) &= [m Dx(t), m Dy(t)]
\end{aligned} \tag{1.46}$$

and

$$D(\partial_2 L \circ \Gamma[q])(t) = [m D^2 x(t), m D^2 y(t)]. \tag{1.47}$$

The component Lagrange equations at time $t$ are

$$\begin{aligned}
m D^2 x(t) + \frac{\mu x(t)}{((x(t))^2 + (y(t))^2)^{3/2}} &= 0 \\
m D^2 y(t) + \frac{\mu y(t)}{((x(t))^2 + (y(t))^2)^{3/2}} &= 0.
\end{aligned} \tag{1.48}$$



Exercise 1.9: Lagrange’s equations 

Derive the Lagrange equations for the following systems, showing all of 
the intermediate steps as we did in the harmonic oscillator and orbital 
motion examples. 

a. A particle of mass $m$ moves in a two-dimensional potential $V(x, y) =
(x^2 + y^2)/2 + x^2 y - y^3/3$, where $x$ and $y$ are rectangular coordinates of
the particle. A Lagrangian for this system is $L(t; x, y; v_x, v_y) =
\tfrac{1}{2} m (v_x^2 + v_y^2) - V(x, y)$.

b. An ideal planar pendulum consists of a bob of mass $m$ connected to
a pivot by a massless rod of length $l$ subject to uniform gravitational
acceleration $g$. A Lagrangian for this system is $L(t, \theta, \dot{\theta}) =
\tfrac{1}{2} m l^2 \dot{\theta}^2 + m g l \cos\theta$. The formal parameters of $L$ are $t$, $\theta$,
and $\dot{\theta}$; $\theta$ measures the angle of the pendulum rod to a plumb-line and
$\dot{\theta}$ is the angular velocity of the rod. 56

c. A Lagrangian for a particle of mass $m$ constrained to move on a
sphere of radius $R$ is $L(t; \theta, \varphi; \alpha, \beta) = \tfrac{1}{2} m R^2 (\alpha^2 + (\beta \sin\theta)^2)$.
The angle $\theta$ is the colatitude of the particle and $\varphi$ is the longitude;
the rate of change of the colatitude is $\alpha$ and the rate of change of the
longitude is $\beta$.



56 The symbol $\dot{\theta}$ is just a mnemonic symbol; the dot over the $\theta$ is not intended
to indicate differentiation. To define $L$ we could just as well have written:
$L(a, b, c) = \tfrac{1}{2} m l^2 c^2 + m g l \cos b$. However, we use a dotted symbol to remind
us that the argument matching a formal parameter, such as $\dot{\theta}$, is a rate of
change of an angle, such as $\theta$.
Exercise 1.10: Higher derivative Lagrangians

Derive Lagrange’s equations for Lagrangians that depend on the accelerations.
In particular, show that the Lagrange equations for Lagrangians
of the form $L(t, q, \dot{q}, \ddot{q})$ with $\ddot{q}$ terms are: 57

$$D^2(\partial_3 L \circ \Gamma[q]) - D(\partial_2 L \circ \Gamma[q]) + \partial_1 L \circ \Gamma[q] = 0. \tag{1.49}$$

In general, these equations, first derived by Poisson, will involve the 
fourth derivative of q. Note that the derivation is completely analogous 
to the derivation of the Lagrange equations without accelerations; it is 
just longer. What restrictions must we place on the variations so that 
the critical path satisfies a differential equation? 

1.5.2 Computing Lagrange’s Equations 

The procedure for computing Lagrange’s equations mirrors the
functional expression (1.18), where the procedure Gamma implements
$\Gamma$: 58

(define ((Lagrange-equations Lagrangian) q)
  (- (D (compose ((partial 2) Lagrangian) (Gamma q)))
     (compose ((partial 1) Lagrangian) (Gamma q))))

The argument of Lagrange-equations is a procedure that com- 
putes a Lagrangian. It returns a procedure that when applied to 
a path q returns a procedure of one argument (time) that com- 
putes the left-hand side of the Lagrange equations (1.18). These 
residual values are zero if q is a path for which the Lagrangian 
action is stationary. 
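The same structure can be imitated numerically outside Scmutils. The Python sketch below is an assumption-laden analogue, not the Scmutils procedure: it handles only a single degree of freedom, it replaces the exact derivative operator D and the partial operators by central differences, and the names derivative, gamma, partial, and lagrange_equations are chosen here only to echo the Scheme code. It returns the residual of equation (1.18) at a given time.

```python
import math

def derivative(f, h=1e-4):
    """Central-difference derivative of a function of one real argument."""
    return lambda t: (f(t + h) - f(t - h)) / (2 * h)

def gamma(q):
    """Map a path q to its local tuple t -> (t, q(t), Dq(t))."""
    dq = derivative(q)
    return lambda t: (t, q(t), dq(t))

def partial(i, L, h=1e-4):
    """Partial derivative of L(t, q, v) with respect to argument slot i."""
    def dL(*args):
        up = list(args)
        dn = list(args)
        up[i] += h
        dn[i] -= h
        return (L(*up) - L(*dn)) / (2 * h)
    return dL

def lagrange_equations(L):
    """Return q -> (t -> D(d2 L o Gamma[q])(t) - (d1 L o Gamma[q])(t))."""
    def for_path(q):
        local = gamma(q)
        momentum = lambda t: partial(2, L)(*local(t))
        force = lambda t: partial(1, L)(*local(t))
        d_momentum = derivative(momentum)
        return lambda t: d_momentum(t) - force(t)
    return for_path

def L_harmonic(m, k):
    """Harmonic-oscillator Lagrangian (1.16) as a function of (t, q, v)."""
    return lambda t, q, v: 0.5 * m * v * v - 0.5 * k * q * q

m, k = 2.0, 8.0
omega = math.sqrt(k / m)
good = lambda t: 1.5 * math.cos(omega * t + 0.3)   # frequency sqrt(k/m)
bad = lambda t: 1.5 * math.cos(1.0 * t + 0.3)      # wrong frequency

residual_good = lagrange_equations(L_harmonic(m, k))(good)(0.7)
residual_bad = lagrange_equations(L_harmonic(m, k))(bad)(0.7)
```

The residual for the path with frequency $\sqrt{k/m}$ is zero up to finite-difference error, while the path with the wrong frequency leaves a residual of order one. Unlike the symbolic procedure, this sketch can only test a path at sample times; it cannot return the equations themselves.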

Observe that the Lagrange-equations procedure, like the La- 
grange equations themselves, is valid for any generalized coordi- 
nate system. When we write programs to investigate particular 
systems, the procedures that implement the Lagrangian function 
and the path q will reflect the actual coordinates chosen to rep- 
resent the system, but we use the same Lagrange-equations pro- 
cedure in each case. This abstraction reflects the important fact 



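57 In traditional notation these equations read

$$\frac{d^2}{dt^2} \frac{\partial L}{\partial \ddot{q}} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} + \frac{\partial L}{\partial q} = 0.$$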

58 The Lagrange-equations procedure uses the operations (partial 1) and 
(partial 2), which implement the partial derivative operators with respect 
to the second and third argument positions (those with indices 1 and 2). 
that the method of derivation of Lagrange’s equations from a Lagrangian
is always the same; it is independent of the number of
degrees of freedom, the topology of the configuration space, and 
the coordinate system used to describe points in the configuration 
space. 

The free particle 

Consider again the case of a free particle. The Lagrangian is 
implemented by the procedure L-free-particle. Rather than 
numerically integrating and minimizing the action, as we did in 
section 1.4, we can check Lagrange’s equations for an arbitrary 
straight-line path $t \mapsto (at + a_0, bt + b_0, ct + c_0)$:

(define (test-path t)
  (up (+ (* 'a t) 'a0)
      (+ (* 'b t) 'b0)
      (+ (* 'c t) 'c0)))

(print-expression
 (((Lagrange-equations (L-free-particle 'm))
   test-path)
  't))

(down 0 0 0)



That the residuals are zero indicates that the test-path satisfies 
the Lagrange equations. 59 

Instead of checking the equations for an individual path in 
three-dimensional space, we can also apply the Lagrange-equations 
procedure to an arbitrary function: 60 

(show-expression
 (((Lagrange-equations (L-free-particle 'm))
   (literal-function 'x))
  't))

(* (((expt D 2) x) t) m)



59 There is a Lagrange equation for every degree of freedom. The residuals of 
all the equations are zero if the path is realizable. The residuals are arranged 
in a down tuple because they result from derivatives of the Lagrangian with 
respect to argument slots that take up tuples. See the appendix on notation. 

60 Observe that the second derivative is indicated as the square of the derivative 
operator (expt D 2) . Arithmetic operations in Scmutils extend over operators 
as well as functions. 
$$m D^2 x(t)$$

The result is an expression containing the arbitrary time $t$ and
mass $m$, so it is zero precisely when $D^2 x = 0$, which is the expected
equation for a free particle.

The harmonic oscillator 

Consider the harmonic oscillator again, with Lagrangian (1.16). 
We know that the motion of a harmonic oscillator is a sinusoid 
with a given amplitude, frequency and phase: 

$$x(t) = a \cos(\omega t + \varphi). \tag{1.50}$$

Suppose we have forgotten how the constants in the solution relate 
to the physical parameters of the oscillator. Let’s plug in the 
proposed solution and look at the residual: 

(define (proposed-solution t)
  (* 'a (cos (+ (* 'omega t) 'phi))))

(show-expression
 (((Lagrange-equations (L-harmonic 'm 'k))
   proposed-solution)
  't))

$$\cos(\omega t + \varphi) \, a \, (k - m \omega^2)$$

The residual here shows that for nonzero amplitude, the only solutions
allowed are ones where $(k - m\omega^2) = 0$, or $\omega = \sqrt{k/m}$.

Exercise 1.11: 

Compute Lagrange’s equations for the Lagrangians in exercise 1.9 using 
the Lagrange-equations procedure. Additionally, use the computer to 
perform each of the steps in the Lagrange-equations procedure and 
show the intermediate results. Relate these steps to the ones you showed 
in the hand derivation of exercise 1.9. 

Exercise 1.12: 

a. Write a procedure to compute the Lagrange equations for Lagrangians 
that depend upon acceleration, as in exercise 1.10. 
b. Use your procedure to compute the Lagrange equations for the Lagrangian

$$L(t, x, v, a) = -\tfrac{1}{2} m x a - \tfrac{1}{2} k x^2.$$

Do you recognize the resulting equation of motion? 

c. For more fun write the general Lagrange equation procedure that 
takes a Lagrangian of any order, and the order, to produce the required 
equations of motion. 



1.6 How to Find Lagrangians 

Lagrange’s equations are a system of second-order differential 
equations. In order to use them to compute the evolution of a 
mechanical system we must find a suitable Lagrangian for the 
system. There is no general way to construct a Lagrangian for 
every system, but there is an important class of systems for which 
we can identify Lagrangians in a straightforward way in terms of 
kinetic and potential energy. The key idea is to construct a La- 
grangian L such that Lagrange’s equations are Newton’s equations 
F = ma. 

Suppose our system consists of $N$ particles indexed by $\alpha$, with
mass $m_\alpha$ and vector position $\vec{x}_\alpha(t)$. Suppose further that the
forces acting on the particles can be written in terms of a gradient
of a potential energy $\mathcal{V}$, which is a function of the positions of
the particles and possibly time, but which does not depend on the
velocities. In other words, the force on particle $\alpha$ is
$\vec{F}_\alpha = -\vec{\nabla}_{\vec{x}_\alpha} \mathcal{V}$, where $\vec{\nabla}_{\vec{x}_\alpha} \mathcal{V}$ is the gradient of $\mathcal{V}$ with
respect to the position of the particle with index $\alpha$. We can write
Newton’s equations as

$$D(m_\alpha D\vec{x}_\alpha)(t) + \vec{\nabla}_{\vec{x}_\alpha} \mathcal{V}(t, \vec{x}_0(t), \ldots, \vec{x}_{N-1}(t)) = 0. \tag{1.51}$$

Vectors can be represented as tuples of components of the vectors
on a rectangular basis. So $\vec{x}_1(t)$ is represented as the tuple
$\mathbf{x}_1(t)$. Let $V$ be the potential energy function expressed in terms
of components:

$$V(t; \mathbf{x}_0(t), \ldots, \mathbf{x}_{N-1}(t)) = \mathcal{V}(t, \vec{x}_0(t), \ldots, \vec{x}_{N-1}(t)). \tag{1.52}$$

Newton’s equations are

$$D(m_\alpha D\mathbf{x}_\alpha)(t) + \partial_{1,\alpha} V(t; \mathbf{x}_0(t), \ldots, \mathbf{x}_\alpha(t), \ldots, \mathbf{x}_{N-1}(t)) = 0, \tag{1.53}$$
where $\partial_{1,\alpha} V$ is the partial derivative of $V$ with respect to the $\mathbf{x}_\alpha(t)$
argument slot.

To form the Lagrange equations we collect all the position
components of all the particles into one tuple $x(t)$, so $x(t) =
(\mathbf{x}_0(t), \ldots, \mathbf{x}_{N-1}(t))$. The Lagrange equations for the coordinate
path $x$ are

$$D(\partial_2 L \circ \Gamma[x]) - (\partial_1 L \circ \Gamma[x]) = 0. \tag{1.54}$$

Observe that Newton’s equations (1.51) are just the components
of the Lagrange equations (1.54) if we choose $L$ to have the
properties

$$\begin{aligned}
\partial_2 L \circ \Gamma[x](t) &= [m_0 D\mathbf{x}_0(t), \ldots, m_{N-1} D\mathbf{x}_{N-1}(t)] \\
\partial_1 L \circ \Gamma[x](t) &= [-\partial_{1,0} V(t, x(t)), \ldots, -\partial_{1,N-1} V(t, x(t))],
\end{aligned} \tag{1.55}$$

where $V(t, x(t)) = V(t; \mathbf{x}_0(t), \ldots, \mathbf{x}_{N-1}(t))$ and $\partial_{1,\alpha} V(t, x(t))$ is
the tuple of the components of the derivative of $V$ with respect
to the coordinates of the particle with index $\alpha$, evaluated at time
$t$ and coordinates $x(t)$. These conditions are satisfied if for every
$\mathbf{a}_\alpha$ and $\mathbf{b}_\alpha$

$$\partial_2 L(t; \mathbf{a}_0, \ldots, \mathbf{a}_{N-1}; \mathbf{b}_0, \ldots, \mathbf{b}_{N-1}) = [m_0 \mathbf{b}_0, \ldots, m_{N-1} \mathbf{b}_{N-1}] \tag{1.56}$$

and

$$\partial_1 L(t; \mathbf{a}_0, \ldots, \mathbf{a}_{N-1}; \mathbf{b}_0, \ldots, \mathbf{b}_{N-1}) = [-\partial_{1,0} V(t, a), \ldots, -\partial_{1,N-1} V(t, a)], \tag{1.57}$$

where $a = (\mathbf{a}_0, \ldots, \mathbf{a}_{N-1})$. We use the symbols $a$ and $b$ to emphasize
that these are just formal parameters of the Lagrangian. One
choice for $L$ that has the required properties (1.56-1.57) is

$$L(t, x, v) = \tfrac{1}{2} \sum_\alpha m_\alpha \mathbf{v}_\alpha^2 - V(t, x), \tag{1.58}$$

where $\mathbf{v}_\alpha^2$ is the sum of the squares of the components of $\mathbf{v}_\alpha$. 61



61 Remember that x and v are just formal parameters of the Lagrangian. This 
x is not the path x used earlier in the derivation, though it could be the value 
of that path at a particular time. 
The first term is the kinetic energy, conventionally denoted $T$.
So this choice for the Lagrangian is $L(t, x, v) = T(t, x, v) - V(t, x)$,
the difference of the kinetic and potential energy. We will often
extend the arguments of the potential energy function to formally
include the velocities so that we can write $L = T - V$. 62

Hamilton’s principle 

Given a system of point particles for which we can identify the 
force as the (negative) derivative of a potential energy V that is 
independent of velocity, we have shown that the system evolves 
along a path that satisfies Lagrange’s equations with $L = T - V$.
Having identified a Lagrangian for this class of systems, we can 
restate the principle of stationary action in terms of energies. This 
statement is known as Hamilton’s principle: A point-particle system
for which the force is derived from a potential energy that
is independent of velocity evolves along a path $q$ for which the
action

$$S[q](t_1, t_2) = \int_{t_1}^{t_2} L \circ \Gamma[q]$$

is stationary with respect to variations of the path $q$ that leave
the endpoints fixed, where $L = T - V$ is the difference between
kinetic and potential energy. 63



62 We can always give a function extra arguments that are not used so that it 
can be algebraically combined with other functions of the same shape. 

63 Hamilton formulated the fundamental variational principle for time- 
independent systems in 1834-1835. Jacobi gave this principle the name 
“Hamilton’s principle.” For systems subject to generic, nonstationary con- 
straints Hamilton’s principle was investigated in 1848 by Ostrogradsky. In 
the Russian literature Hamilton’s principle is often called the Hamilton- 
Ostrogradsky principle. 

William Rowan Hamilton (1805-1865) was a brilliant 19th-century mathe- 
matician. His early work on geometric optics (based on Fermat’s principle) 
was so impressive that he was elected to the post of Professor of Astronomy at 
Trinity College and Royal Astronomer of Ireland while he was still an under- 
graduate. He produced two monumental works of 19th-century mathematics. 
His discovery of quaternions revitalized abstract algebra and sparked the de- 
velopment of vector techniques in physics. His 1835 memoir “On a General 
Method in Dynamics” put variational mechanics on a firm footing, finally giv- 
ing substance to Maupertuis’s vaguely stated Principle of Least Action of 100 
years before. Hamilton also wrote poetry and carried on an extensive corre- 
spondence with Wordsworth, who advised him to put his energy into writing 
mathematics rather than poetry. 
It might seem that we have reduced Lagrange’s equations to
nothing more than $F = ma$, and indeed, the principle is motivated
by comparing the two equations for this special class of systems.
However, the Lagrangian formulation of the equations of motion
has an important advantage over $F = ma$. Our derivation used
the rectangular components $\mathbf{x}_\alpha$ of the positions of the constituent
particles for the generalized coordinates, but if the system’s path
satisfies Lagrange’s equations in some particular coordinate system,
it must satisfy the equations in any coordinate system. Thus
we see that $L = T - V$ is suitable as a Lagrangian, with any set of
generalized coordinates. The equations of variational mechanics
are derived the same way in any configuration space and any
coordinate system. In contrast, the Newtonian formulation is based
on elementary geometry: in order for $D^2 \vec{x}(t)$ to be meaningful
as an acceleration, $\vec{x}(t)$ must be a vector in physical space.
Lagrange’s equations have no such restriction on the meaning of the
coordinate $q$. The generalized coordinates can be any parameters
that conveniently describe the configurations of the system.

Constant acceleration 

Consider a particle of mass $m$ in a uniform gravitational field with
acceleration $g$. The potential energy is $mgh$ where $h$ is the height
of the particle. The kinetic energy is just $\tfrac{1}{2} m v^2$. A Lagrangian
for the system is the difference of the kinetic and potential energies.
In rectangular coordinates, with $y$ measuring the vertical
position and $x$ measuring the horizontal position, the Lagrangian
is $L(t; x, y; v_x, v_y) = \tfrac{1}{2} m (v_x^2 + v_y^2) - m g y$. We have

(define ((L-uniform-acceleration m g) local)
  (let ((q (coordinate local))
        (v (velocity local)))
    (let ((y (ref q 1)))
      (- (* 1/2 m (square v)) (* m g y)))))



In addition to the formulation of the fundamental variational principle, 
Hamilton also stressed the analogy between geometric optics and mechanics, 
and stressed the importance of the momentum variables (which were earlier 
introduced by Lagrange and Cauchy), leading to the “canonical” form of me- 
chanics, which we discuss in chapter 3. 







(show-expression
 (((Lagrange-equations
    (L-uniform-acceleration 'm 'g))
   (up (literal-function 'x)
       (literal-function 'y)))
  't))

$$\begin{bmatrix} m D^2 x(t) \\ g m + m D^2 y(t) \end{bmatrix}$$

This equation describes unaccelerated motion in the horizontal
direction ($m D^2 x(t) = 0$) and constant acceleration in the vertical
direction ($m D^2 y(t) = -g m$).

Central force field 

Consider planar motion of a particle of mass m in a central force 
field, with an arbitrary potential energy U (r) depending only upon 
the distance r to the center of attraction. We will derive the La- 
grange equations for this system in both rectangular coordinates 
and polar coordinates. 

In rectangular coordinates $(x, y)$, with origin at the center of
attraction, the potential energy is $V(t; x, y) = U(\sqrt{x^2 + y^2})$. The
kinetic energy is $T(t; x, y; v_x, v_y) = \tfrac{1}{2} m (v_x^2 + v_y^2)$. A Lagrangian
for the system is $L = T - V$:

$$L(t; x, y; v_x, v_y) = \tfrac{1}{2} m (v_x^2 + v_y^2) - U(\sqrt{x^2 + y^2}). \tag{1.59}$$

As a procedure:

(define ((L-central-rectangular m U) local)
  (let ((q (coordinate local))
        (v (velocity local)))
    (- (* 1/2 m (square v))
       (U (sqrt (square q))))))

The Lagrange equations are

(show-expression
 (((Lagrange-equations
    (L-central-rectangular 'm (literal-function 'U)))
   (up (literal-function 'x)
       (literal-function 'y)))
  't))
We can rewrite these Lagrange equations as:

$$m D^2 x(t) = -\frac{x(t)}{r(t)} DU(r(t)) \tag{1.60}$$

$$m D^2 y(t) = -\frac{y(t)}{r(t)} DU(r(t)), \tag{1.61}$$

where $r(t) = \sqrt{(x(t))^2 + (y(t))^2}$. We can interpret these as follows.
The particle is subject to a radially directed force with
magnitude $-DU(r)$. Newton’s equations equate the force with
the product of the mass and the acceleration. The two Lagrange
equations are just the rectangular components of Newton’s equations.

We can describe the same system in polar coordinates. The
relationship between rectangular coordinates $(x, y)$ and polar
coordinates $(r, \varphi)$ is:

$$\begin{aligned}
x &= r \cos\varphi \\
y &= r \sin\varphi.
\end{aligned} \tag{1.62}$$

The relationship of the generalized velocities is derived from the
coordinate transformation. Consider a configuration path that is
represented in both rectangular and polar coordinates. Let $x$ and
$y$ be components of the rectangular coordinate path, and let $r$ and
$\varphi$ be components of the corresponding polar coordinate path. The
rectangular components at time $t$ are $(x(t), y(t))$, and the polar
coordinates at time $t$ are $(r(t), \varphi(t))$. They are related by (1.62):

$$\begin{aligned}
x(t) &= r(t) \cos\varphi(t) \\
y(t) &= r(t) \sin\varphi(t).
\end{aligned} \tag{1.63}$$








The rectangular velocity at time $t$ is $(Dx(t), Dy(t))$. Differentiating 
(1.63) gives the relationship among the velocities 

$$Dx(t) = Dr(t) \cos\varphi(t) - r(t) D\varphi(t) \sin\varphi(t)$$
$$Dy(t) = Dr(t) \sin\varphi(t) + r(t) D\varphi(t) \cos\varphi(t). \tag{1.64}$$

These relations are valid for any configuration path at any moment, 
so we can abstract them to relations among coordinate 
representations of an arbitrary velocity. Let $v_x$ and $v_y$ be the 
rectangular components of the velocity, and let $\dot r$ and $\dot\varphi$ be the 
rates of change of $r$ and $\varphi$. Then 

$$v_x = \dot r \cos\varphi - r \dot\varphi \sin\varphi$$
$$v_y = \dot r \sin\varphi + r \dot\varphi \cos\varphi. \tag{1.65}$$


The kinetic energy is $\frac{1}{2} m (v_x^2 + v_y^2)$: 

$$T(t; r, \varphi; \dot r, \dot\varphi) = \tfrac{1}{2} m (\dot r^2 + r^2 \dot\varphi^2), \tag{1.66}$$
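The step from (1.65) to (1.66) is short enough to write out. As a worked check, using only the symbols already introduced, squaring and adding the two components of (1.65) makes the cross terms cancel:

```latex
\begin{aligned}
v_x^2 + v_y^2
 &= (\dot r \cos\varphi - r \dot\varphi \sin\varphi)^2
  + (\dot r \sin\varphi + r \dot\varphi \cos\varphi)^2 \\
 &= \dot r^2 (\cos^2\varphi + \sin^2\varphi)
  + r^2 \dot\varphi^2 (\sin^2\varphi + \cos^2\varphi)
  + 2 r \dot r \dot\varphi (\sin\varphi \cos\varphi - \cos\varphi \sin\varphi) \\
 &= \dot r^2 + r^2 \dot\varphi^2 .
\end{aligned}
```

Multiplying by $\frac{1}{2} m$ gives the kinetic energy in polar coordinates.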


and the Lagrangian is 

$$L(t; r, \varphi; \dot r, \dot\varphi) = \tfrac{1}{2} m (\dot r^2 + r^2 \dot\varphi^2) - U(r). \tag{1.67}$$


We express this Lagrangian as follows: 

(define ((L-central-polar m U) local)
  (let ((q (coordinate local))
        (qdot (velocity local)))
    (let ((r (ref q 0)) (phi (ref q 1))
          (rdot (ref qdot 0)) (phidot (ref qdot 1)))
      (- (* 1/2 m
            (+ (square rdot)
               (square (* r phidot))))
         (U r)))))

Lagrange’s equations are: 

(show-expression
 (((Lagrange-equations
    (L-central-polar 'm (literal-function 'U)))
   (up (literal-function 'r)
       (literal-function 'phi)))
  't))

$$\begin{bmatrix} m D^2 r(t) - m r(t) (D\varphi(t))^2 + DU(r(t)) \\ 2 m Dr(t)\, r(t)\, D\varphi(t) + m D^2\varphi(t)\, (r(t))^2 \end{bmatrix}$$


We can interpret the first equation as expressing that the product 
of the mass and the radial acceleration is the sum of the force due 
to the potential and the centrifugal force. The second equation 
can be interpreted as saying that the derivative of the angular 
momentum $m r^2 D\varphi$ is zero; so angular momentum is conserved. 64 

Note that we used the same Lagrange-equations procedure 
for the derivation in both coordinate systems. Coordinate repre- 
sentations of the Lagrangian are different for different coordinate 
systems, and the Lagrange equations in different coordinate sys- 
tems look different. Yet, the same method is used to derive the 
Lagrange equations in any coordinate system. 
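As a numerical spot check that the two sets of equations describe the same motion, here is a minimal Python sketch (plain Python, not the scmutils system used in this book). It assumes an illustrative potential $U(r) = k r^2/2$, solves the polar Lagrange equations for the accelerations at one state, converts them to rectangular accelerations, and compares the result with equations (1.60) and (1.61). The function names and sample values are hypothetical and chosen only for the illustration; a check at one state is not a substitute for the derivation.

```python
import math

def U_prime(r, k=3.0):
    """DU(r) for the illustrative potential U(r) = k r^2 / 2 (an assumption)."""
    return k * r

def polar_accel(m, r, phi, rdot, phidot):
    """Solve the polar Lagrange equations for (D^2 r, D^2 phi)."""
    r_dd = r * phidot ** 2 - U_prime(r) / m
    phi_dd = -2.0 * rdot * phidot / r
    return r_dd, phi_dd

def rect_accel_from_polar(m, r, phi, rdot, phidot):
    """Differentiate x = r cos(phi), y = r sin(phi) twice along a path."""
    r_dd, phi_dd = polar_accel(m, r, phi, rdot, phidot)
    c, s = math.cos(phi), math.sin(phi)
    x_dd = r_dd * c - 2.0 * rdot * phidot * s - r * phidot ** 2 * c - r * phi_dd * s
    y_dd = r_dd * s + 2.0 * rdot * phidot * c - r * phidot ** 2 * s + r * phi_dd * c
    return x_dd, y_dd

def rect_accel_direct(m, r, phi):
    """Accelerations read off from equations (1.60) and (1.61)."""
    x, y = r * math.cos(phi), r * math.sin(phi)
    return -(x / r) * U_prime(r) / m, -(y / r) * U_prime(r) / m
```

Calling `rect_accel_from_polar(2.0, 1.5, 0.7, 0.3, -0.4)` and `rect_accel_direct(2.0, 1.5, 0.7)` should agree to rounding error.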

Exercise 1.13: 

Check that the Lagrange equations for central force motion in polar 
coordinates and the Lagrange equations in rectangular coordinates are 
equivalent. Determine the relationship among the second derivatives 
by substituting paths into the transformation equations and computing 
derivatives, then substitute these relations into the equations of motion. 

1.6.1 Coordinate Transformations 

The motion of a system is independent of the coordinates we use to 
describe it. This coordinate-free nature of the motion is apparent 
in the action principle. The action depends only on the value of the 
Lagrangian along the path and not on the particular coordinates 
used in the representation of the Lagrangian. We can use this 
property to find a Lagrangian in one coordinate system in terms 
of a Lagrangian in another coordinate system. 

Suppose we have a mechanical system whose motion is de- 
scribed by a Lagrangian L that depends on time, coordinates, 
and velocities. And suppose we have a coordinate transformation 
F such that x = F(t, x'). The Lagrangian L is expressed in terms 
of the unprimed coordinates. We want to find a Lagrangian L' ex- 
pressed in the primed coordinates that describes the same system. 
One way to do this is to require that the value of the Lagrangian 
along any configuration path be independent of the coordinate 
system. If $q$ is a path in the unprimed coordinates and $q'$ is the 
corresponding path in primed coordinates, then the Lagrangians 
must satisfy: 

$$L' \circ \Gamma[q'] = L \circ \Gamma[q]. \tag{1.68}$$

64 We will talk much more about angular momentum later. 

We have seen that the transformation from rectangular to polar 
coordinates implies that the generalized velocities transform in a 
certain way. The velocity transformation can be deduced from the 
requirement that a path in polar coordinates and a corresponding 
path in rectangular coordinates are consistent with the coordinate 
transformation. In general, the requirement that paths in two 
different coordinate systems are consistent with the coordinate 
transformation can be used to deduce how all of the components 
of the local tuple transform. Given a coordinate transformation 
$F$, let $C$ be the corresponding function that maps local tuples in 
the primed coordinate system to corresponding local tuples in the 
unprimed coordinate system: 

$$C \circ \Gamma[q'] = \Gamma[q]. \tag{1.69}$$

We will deduce the general form of C below. 

Given such a local-tuple transformation $C$, a Lagrangian $L'$ that 
satisfies equation (1.68) is 

$$L' = L \circ C. \tag{1.70}$$

We can see this by substituting $L'$ into equation (1.68): 

$$L' \circ \Gamma[q'] = L \circ C \circ \Gamma[q'] = L \circ \Gamma[q]. \tag{1.71}$$

To deduce the local-tuple transformation C given a coordinate 
transformation F, we deduce how each component of the local tu- 
ple transforms. Of course, the coordinate transformation specifies 
how the coordinate component of the local tuple transforms. The 
generalized velocity component of the local-tuple transformation 
can be deduced as follows. Let q and q' be the same configura- 
tion path expressed in the two coordinate systems. Substituting 
these paths into the coordinate transformation and computing the 
derivative we find 



$$Dq(t) = \partial_0 F(t, q'(t)) + \partial_1 F(t, q'(t)) Dq'(t). \tag{1.72}$$
Through any point there is always a path of any given velocity, 
so we may generalize, and conclude that along corresponding 
coordinate paths the generalized velocities satisfy 

$$v = \partial_0 F(t, x') + \partial_1 F(t, x') v'. \tag{1.73}$$

If needed, rules for higher derivative components of the local tuple 
can be determined in a similar fashion. The local-tuple transfor- 
mation that takes a local tuple in the primed system to a local 
tuple in the unprimed system is constructed from the component 
transformations: 

$$(t, x, v, \ldots) = C(t, x', v', \ldots) = (t, F(t, x'), \partial_0 F(t, x') + \partial_1 F(t, x') v', \ldots). \tag{1.74}$$

So if we take the Lagrangian $L'$ to be 

$$L' = L \circ C \tag{1.75}$$

then the action has a value that is independent of the coordinate 
system used to compute it. The configuration path of stationary 
action does not depend on which coordinate system is used to 
describe the path. The Lagrange equations derived from these 
Lagrangians will in general look very different from one another, 
but they must be equivalent. 

Exercise 1.14: 

Show by direct calculation that the Lagrange equations for $L'$ are 
satisfied if the Lagrange equations for $L$ are satisfied. 

Given a coordinate transformation $F$, we can use equation (1.74) 
to find the function $C$, which transforms local tuples. The 
procedure F->C implements this: 65 

(define ((F->C F) local)
  (->local (time local)
           (F local)
           (+ (((partial 0) F) local)
              (* (((partial 1) F) local)
                 (velocity local)))))



65 As described in footnote 28 the procedure ->local constructs a local tuple 
from an initial segment of time, coordinates, and velocities. 
As an illustration, consider the transformation from polar to 
rectangular coordinates: $x = r \cos\varphi$ and $y = r \sin\varphi$, with the 
following implementation: 

(define (p->r local)
  (let ((polar-tuple (coordinate local)))
    (let ((r (ref polar-tuple 0))
          (phi (ref polar-tuple 1)))
      (let ((x (* r (cos phi)))
            (y (* r (sin phi))))
        (up x y)))))

In terms of the polar coordinates and the rates of change of the po- 
lar coordinates, the rates of change of the rectangular components 
are: 

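(show-expression
 (velocity
  ((F->C p->r)
   (->local 't (up 'r 'phi) (up 'rdot 'phidot)))))

$$\begin{pmatrix} -r \dot\varphi \sin\varphi + \dot r \cos\varphi \\ r \dot\varphi \cos\varphi + \dot r \sin\varphi \end{pmatrix}$$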
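For readers without scmutils at hand, the construction in equation (1.74) can be imitated numerically. The following Python sketch is an illustration under stated assumptions, not the book's F->C: it takes a coordinate map as a function of time and a tuple of coordinates (rather than of a local tuple) and estimates the partial derivatives by central differences, so the resulting velocities are approximate. The names `F_to_C` and `p_to_r` and the step size `h` are hypothetical.

```python
import math

def F_to_C(F, h=1e-6):
    """Return C mapping (t, q', v') to (t, q, v), following equation (1.74).

    F(t, q) takes a time and a tuple of primed coordinates and returns a
    tuple of unprimed coordinates. Derivatives are central differences.
    """
    def C(local):
        t, q, v = local
        x = F(t, q)
        # partial_0 F: derivative with respect to the time argument
        fp, fm = F(t + h, q), F(t - h, q)
        vel = [(a - b) / (2 * h) for a, b in zip(fp, fm)]
        # partial_1 F contracted with v': one term per coordinate slot
        for j, vj in enumerate(v):
            qp = tuple(qi + (h if i == j else 0.0) for i, qi in enumerate(q))
            qm = tuple(qi - (h if i == j else 0.0) for i, qi in enumerate(q))
            fp, fm = F(t, qp), F(t, qm)
            for i in range(len(vel)):
                vel[i] += (fp[i] - fm[i]) / (2 * h) * vj
        return (t, tuple(x), tuple(vel))
    return C

def p_to_r(t, q):
    """Polar to rectangular coordinates, equation (1.62)."""
    r, phi = q
    return (r * math.cos(phi), r * math.sin(phi))
```

Applying `F_to_C(p_to_r)` to the local tuple `(0.0, (2.0, 0.5), (0.3, -0.7))` should reproduce the velocities of equation (1.65) to within the finite-difference error.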




We can use F->C to find the Lagrangian for central force motion in 
polar coordinates from the Lagrangian in rectangular components, 
using equation (1.70), 



(define (L-central-polar m U)
  (compose (L-central-rectangular m U) (F->C p->r)))

(show-expression
 ((L-central-polar 'm (literal-function 'U))
  (->local 't (up 'r 'phi) (up 'rdot 'phidot))))

$$\tfrac{1}{2} m r^2 \dot\varphi^2 + \tfrac{1}{2} m \dot r^2 - U(r)$$

The result is the same as Lagrangian (1.67). 



Exercise 1.15: Central force motion 

Find Lagrangians for central force motion in three dimensions in 
rectangular coordinates and in spherical coordinates. First, find the 
Lagrangians analytically, then check the results with the computer by 
generalizing the programs that we have presented. 

1.6.2 Systems with Rigid Constraints 

We have found that L = T — V is a suitable Lagrangian for a 
system of point particles subject to forces derived from a potential. 
Extended bodies can sometimes be conveniently idealized as a 
system of point particles connected by rigid constraints. We will 
find that L = T — V , expressed in irredundant coordinates, is a 
suitable Lagrangian for modeling systems of point particles with 
rigid constraints. We will first illustrate the method and then 
provide a justification. 

Lagrangians for rigidly constrained systems 

The system is presumed to be made of $N$ point masses, indexed by 
$\alpha$, in ordinary three-dimensional space. The first step is to choose 
a convenient set of irredundant generalized coordinates $q$ and 
redescribe the system in terms of these. In terms of the generalized 
coordinates the rectangular coordinates of particle $\alpha$ are: 

$$\mathbf{x}_\alpha = f_\alpha(t, q). \tag{1.76}$$

For irredundant coordinates $q$ all the coordinate constraints are 
built into the functions $f_\alpha$. We deduce the relationship of the 
generalized velocities $v$ to the velocities of the constituent particles 
$\mathbf{v}_\alpha$ by inserting path functions into equation (1.76), differentiating, 
and abstracting to arbitrary velocities. 66 We find 

$$\mathbf{v}_\alpha = \partial_0 f_\alpha(t, q) + \partial_1 f_\alpha(t, q) v. \tag{1.77}$$

We use equations (1.76) and (1.77) to express the kinetic energy 
in terms of the generalized coordinates and velocities. Let $\mathcal{T}$ be 
the kinetic energy as a function of the rectangular coordinates and 
velocities: 

$$\mathcal{T}(t; \mathbf{x}_0, \ldots, \mathbf{x}_{N-1}; \mathbf{v}_0, \ldots, \mathbf{v}_{N-1}) = \sum_\alpha \tfrac{1}{2} m_\alpha v_\alpha^2, \tag{1.78}$$

where $v_\alpha^2$ is the squared magnitude of $\mathbf{v}_\alpha$. As a function of the 
generalized coordinate tuple $q$ and the generalized velocity tuple 
$v$ the kinetic energy is 

$$T(t, q, v) = \mathcal{T}(t, f(t, q), \partial_0 f(t, q) + \partial_1 f(t, q) v) = \sum_\alpha \tfrac{1}{2} m_\alpha (\partial_0 f_\alpha(t, q) + \partial_1 f_\alpha(t, q) v)^2. \tag{1.79}$$

66 See section 1.6.1. 

Figure 1.2 The pendulum is driven by vertical motion of the pivot. 
The pivot slides on the y-axis. Although the bob is drawn as a blob 
it is modeled as a point mass. The bob is acted on by the uniform 
acceleration $g$ of gravity in the negative y direction. 

Similarly, we use equation (1.76) to reexpress the potential 
energy in terms of the generalized coordinates. Let $\mathcal{V}(t, x)$ be the 
potential energy at time $t$ in the configuration specified by the 
tuple of rectangular coordinates $x$. Expressed in generalized 
coordinates the potential energy is 

$$V(t, q, v) = \mathcal{V}(t, f(t, q)). \tag{1.80}$$

We take the Lagrangian to be the difference of the kinetic energy 
and the potential energy: $L = T - V$. 

A pendulum driven at the pivot 

Consider a pendulum (see figure 1.2) of length $l$ and mass $m$, 
modeled as a point mass, supported by a pivot that is driven in 
the vertical direction by a given function of time $y_s$. 

The dimension of the configuration space for this system is one; 
we choose $\theta$, shown in figure 1.2, as the generalized coordinate. 
The position of the bob is given, in rectangular coordinates, by 

$$x = l \sin\theta \quad\text{and}\quad y = y_s(t) - l \cos\theta. \tag{1.81}$$

The velocities are 

$$v_x = l \dot\theta \cos\theta \quad\text{and}\quad v_y = Dy_s(t) + l \dot\theta \sin\theta, \tag{1.82}$$

obtained by differentiating along a path and abstracting to 
velocities at the moment. 

The kinetic energy is $\mathcal{T}(t; x, y; v_x, v_y) = \frac{1}{2} m (v_x^2 + v_y^2)$. Expressed 
in generalized coordinates the kinetic energy is 

$$T(t, \theta, \dot\theta) = \tfrac{1}{2} m \left( l^2 \dot\theta^2 + (Dy_s(t))^2 + 2 l\, Dy_s(t)\, \dot\theta \sin\theta \right). \tag{1.83}$$

The potential energy is $\mathcal{V}(t; x, y) = m g y$. Expressed in 
generalized coordinates the potential energy is 

$$V(t, \theta, \dot\theta) = g m (y_s(t) - l \cos\theta). \tag{1.84}$$

A Lagrangian is $L = T - V$. 

The Lagrangian is expressed as 

(define ((T-pend m l g ys) local)
  (let ((t (time local))
        (theta (coordinate local))
        (thetadot (velocity local)))
    (let ((vys (D ys)))
      (* 1/2 m
         (+ (square (* l thetadot))
            (square (vys t))
            (* 2 l (vys t) thetadot (sin theta)))))))

(define ((V-pend m l g ys) local)
  (let ((t (time local))
        (theta (coordinate local)))
    (* m g (- (ys t) (* l (cos theta))))))

(define L-pend (- T-pend V-pend))







Lagrange’s equation for this system is 67 

(show-expression
 (((Lagrange-equations
    (L-pend 'm 'l 'g (literal-function 'y_s)))
   (literal-function 'theta))
  't))

$$D^2\theta(t)\, l^2 m + D^2 y_s(t) \sin(\theta(t))\, l m + \sin(\theta(t))\, g l m$$
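The residual above can be cross-checked numerically without symbolic algebra. The Python sketch below is an illustration, not part of scmutils: it evaluates $D(\partial_2 L \circ \Gamma[\theta]) - \partial_1 L \circ \Gamma[\theta]$ by central differences along an arbitrary test path and compares it with the closed form just displayed. The drive $y_s(t) = a \cos(\omega t)$, the test path, and all parameter values are assumptions chosen for the example.

```python
import math

m, l, g = 1.3, 0.8, 9.8      # illustrative parameter values
a, w = 0.2, 3.0              # drive y_s(t) = a cos(w t), an assumption

def ys(t):
    return a * math.cos(w * t)

def Dys(t):
    return -a * w * math.sin(w * t)

def D2ys(t):
    return -a * w * w * math.cos(w * t)

# An arbitrary smooth test path; it need not be a solution.
def theta(t):
    return 0.3 * math.sin(2 * t) + 0.1 * t

def Dtheta(t):
    return 0.6 * math.cos(2 * t) + 0.1

def D2theta(t):
    return -1.2 * math.sin(2 * t)

def L(t, th, thd):
    """L = T - V from equations (1.83) and (1.84)."""
    T = 0.5 * m * ((l * thd) ** 2 + Dys(t) ** 2
                   + 2 * l * Dys(t) * thd * math.sin(th))
    V = m * g * (ys(t) - l * math.cos(th))
    return T - V

def lagrange_residual(t, h=1e-4):
    """Finite-difference Lagrange-equation residual along the test path."""
    def p(s):  # momentum: partial of L with respect to the velocity slot
        th, thd = theta(s), Dtheta(s)
        return (L(s, th, thd + h) - L(s, th, thd - h)) / (2 * h)
    th, thd = theta(t), Dtheta(t)
    dL_dtheta = (L(t, th + h, thd) - L(t, th - h, thd)) / (2 * h)
    return (p(t + h) - p(t - h)) / (2 * h) - dL_dtheta

def closed_form(t):
    """The expression produced by show-expression above."""
    th = theta(t)
    return (D2theta(t) * l * l * m + D2ys(t) * math.sin(th) * l * m
            + math.sin(th) * g * l * m)
```

The two functions should agree at any time to within the finite-difference error, which depends on the step `h`.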



Exercise 1.16: 

Derive the Lagrangians in exercise 1.9. 

Exercise 1.17: Bead on a helical wire 

A bead of mass $m$ is constrained to move on a frictionless helical wire. 
The helix is oriented so that its axis is horizontal. The diameter of the 
helix is d and its pitch (turns per unit length) is h. The system is in 
a uniform gravitational field with vertical acceleration g. Formulate a 
Lagrangian that describes the system and find the Lagrange equations 
of motion. 



Exercise 1.18: Bead on a triaxial surface 

A bead of mass $m$ moves without friction on a triaxial ellipsoidal surface. 
In rectangular coordinates the surface satisfies 

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \tag{1.85}$$

for some constants $a$, $b$, and $c$. Identify suitable generalized coordinates, 
formulate a Lagrangian, and find Lagrange's equations. 



Exercise 1.19: A two-bar linkage 

The two-bar linkage shown in figure 1.3 is constrained to move in the 
plane. It is composed of three small massive bodies interconnected by 
two massless rigid rods in a uniform gravitational field with vertical 
acceleration $g$. The rods are pinned to the central body by a hinge that 
allows the linkage to fold. The system is arranged so that the hinge is 
completely free: the members can go through all configurations without 
collision. Formulate a Lagrangian that describes the system and find 
the Lagrange equations of motion. Use the computer to do this, because 
the equations are rather big. 

67 We hope you appreciate the TeX magic here. A symbol with an underline 
character is converted by show-expression to a subscript. Symbols with carets, 
the names of Greek letters, and symbols terminating in the characters "dot" 
are similarly mistreated. 

Figure 1.3 A two-bar linkage is modeled by three point masses 
connected by rigid massless struts. This linkage is subject to a uniform 
vertical gravitational acceleration. 

Figure 1.4 This pendulum is pivoted on a point particle of mass $m_1$ 
that is allowed to slide on a horizontal rail. The pendulum bob is a point 
particle of mass $m_2$ that is acted on by the vertical force of gravity. 



Exercise 1.20: Sliding pendulum 

Consider a pendulum of length $l$ attached to a support that is free to 
move horizontally, shown in figure 1.4. Let the mass of the support be 
$m_1$ and the mass of the pendulum be $m_2$. Formulate a Lagrangian and 
derive Lagrange's equations for this system. 

Why it works 

In this section we show that $L = T - V$ is in fact a suitable 
Lagrangian for rigidly constrained systems. We do this by requiring 
that the Lagrange equations be equivalent to the Newtonian 
vectorial dynamics with vector constraint forces. 68 

We consider a system of particles. The particle with index $\alpha$ has 
mass $m_\alpha$ and position $\mathbf{x}_\alpha(t)$ at time $t$. There may be a very large 
number of these particles, or just a few. Some of the positions 
may also be specified functions of time, such as the position of the 
pivot of a driven pendulum. There are rigid position constraints 
among some of the particles; we assume all of these constraints 
are of the form 

$$(\mathbf{x}_\alpha(t) - \mathbf{x}_\beta(t)) \cdot (\mathbf{x}_\alpha(t) - \mathbf{x}_\beta(t)) = l_{\alpha\beta}^2, \tag{1.86}$$

that is, the distance between particles $\alpha$ and $\beta$ is $l_{\alpha\beta}$. 

The Newtonian equation of motion for particle $\alpha$ says that the 
mass times the acceleration of particle $\alpha$ is equal to the sum of the 
potential forces and the constraint forces. The potential forces are 
derived as the negative gradient of the potential energy, and may 
depend on the positions of the other particles and the time. The 
constraint forces $\mathbf{F}_{\alpha\beta}$ are the vector constraint forces associated 
with the rigid constraint between particle $\alpha$ and particle $\beta$. So 

$$D(m_\alpha D\mathbf{x}_\alpha)(t) = -\nabla_{\mathbf{x}_\alpha} \mathcal{V}(t, \mathbf{x}_0(t), \ldots, \mathbf{x}_{N-1}(t)) + \sum_{\{\beta \mid \beta \leftrightarrow \alpha\}} \mathbf{F}_{\alpha\beta}(t), \tag{1.87}$$

where in the summation $\beta$ ranges over only those particle indices 
for which there are rigid constraints with the particle indexed by 
$\alpha$; we use the notation $\beta \leftrightarrow \alpha$ for the relation that there is a rigid 
constraint between the indicated particles. 



68 We will simply accept the Newtonian procedure for systems with rigid con- 
straints and find Lagrangians that are equivalent. Of course, actual bodies are 
never truly rigid, so we may wonder what detailed approximations have to be 
made to treat them as truly rigid. For instance, a more satisfying approach 
would be to replace the rigid distance constraints by very stiff springs. We 
could then immediately write the Lagrangian as L = T — V, and we should 
be able to derive the Newtonian procedure for systems with rigid constraints 
as an approximation. However, this is too complicated to do at this stage, so 
we accept the Newtonian idealization. 
The force of constraint is directed along the line between the 
particles, so we may write 

$$\mathbf{F}_{\alpha\beta}(t) = F_{\alpha\beta}(t)\, \frac{\mathbf{x}_\beta(t) - \mathbf{x}_\alpha(t)}{l_{\alpha\beta}}, \tag{1.88}$$

where $F_{\alpha\beta}(t)$ is the scalar magnitude of the tension in the 
constraint at time $t$. Note that $\mathbf{F}_{\alpha\beta} = -\mathbf{F}_{\beta\alpha}$. In general, the scalar 
constraint forces change as the system evolves. 

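Formally, we can reproduce Newton's equations with the 
Lagrangian 69 

$$L(t; x, F; \dot x, \dot F) = \sum_\alpha \tfrac{1}{2} m_\alpha \dot{\mathbf{x}}_\alpha^2 - \mathcal{V}(t, x) - \sum_{\{\alpha, \beta \mid \alpha < \beta,\ \alpha \leftrightarrow \beta\}} \frac{F_{\alpha\beta}}{2 l_{\alpha\beta}} \left[ (\mathbf{x}_\beta - \mathbf{x}_\alpha)^2 - l_{\alpha\beta}^2 \right], \tag{1.89}$$

where the constraint forces are being treated as additional 
generalized coordinates. Here $x$ is a structure composed of all of the 
rectangular components $x_\alpha$ of all of the $\mathbf{x}_\alpha$, $\dot x$ is a structure 
composed of all the rectangular components $\dot x_\alpha$ of all of the velocity 
vectors $\mathbf{v}_\alpha$, and $F$ is a structure composed of all of the $F_{\alpha\beta}$. The 
velocity of $F$ does not appear in the Lagrangian, and $F$ itself only 
appears linearly. So the Lagrange equations associated with $F$ are 

$$(\mathbf{x}_\beta(t) - \mathbf{x}_\alpha(t))^2 - l_{\alpha\beta}^2 = 0, \tag{1.90}$$

but this is just a restatement of the constraints. The Lagrange 
equations for the particle coordinates are Newton's equations (1.87): 

$$D(m_\alpha D\mathbf{x}_\alpha)(t) = -\partial_{1,\alpha} \mathcal{V}(t, x(t)) + \sum_{\{\beta \mid \alpha \leftrightarrow \beta\}} F_{\alpha\beta}(t)\, \frac{\mathbf{x}_\beta(t) - \mathbf{x}_\alpha(t)}{l_{\alpha\beta}}. \tag{1.91}$$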
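To see where the constraint-force term in (1.91) comes from, differentiate one bracket of the constraint sum in (1.89) with respect to the position of particle $\alpha$. This sketch assumes that each constrained pair carries a single scalar tension, labeled the same way regardless of the order of the indices, so that each pair contributes once to the sum over $\alpha < \beta$:

```latex
\begin{aligned}
\frac{\partial}{\partial \mathbf{x}_\alpha}
 \left( -\frac{F_{\alpha\beta}}{2 l_{\alpha\beta}}
   \left[ (\mathbf{x}_\beta - \mathbf{x}_\alpha)^2 - l_{\alpha\beta}^2 \right] \right)
 &= -\frac{F_{\alpha\beta}}{2 l_{\alpha\beta}}
    \cdot 2 (\mathbf{x}_\beta - \mathbf{x}_\alpha) \cdot (-1) \\
 &= F_{\alpha\beta}\, \frac{\mathbf{x}_\beta - \mathbf{x}_\alpha}{l_{\alpha\beta}} ,
\end{aligned}
```

which is the vector constraint force of (1.88). The constraint terms contain no velocities, so they contribute nothing to the momentum side $D(m_\alpha D\mathbf{x}_\alpha)$.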



69 This Lagrangian is purely formal and does not represent a model of the 
constraint forces. In particular, note that the constraint terms do not look 
like a potential of constraint with a minimum when the constraint is exactly 
satisfied. Rather, the constraint terms in the Lagrangian are zero when the 
constraint is satisfied, and can be either positive or negative depending on 
whether the distance between the particles is larger or smaller than the con- 
straint distance. 







Now that we have a suitable Lagrangian, we can use the fact 
that Lagrangians can be reexpressed in any generalized coordi- 
nates to find a simpler Lagrangian. The strategy is to choose 
a new set of coordinates for which many of the coordinates are 
constants and the remaining coordinates are irredundant. 

Let $q$ be a tuple of generalized coordinates that specify the 
degrees of freedom of the system without redundancy. Let $c$ be a 
tuple of other generalized coordinates that specify the distances 
between particles for which constraints are specified. The $c$ 
coordinates will have constant values. The combination of $q$ and $c$ 
replaces the redundant rectangular coordinates $x$. 70 In addition, 
we still have the $F$ coordinates, which are the scalar constraint 
forces. Our new coordinates are the components of $q$, $c$, and $F$. 

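There exist functions $f_\alpha$ that give the rectangular coordinates 
of the constituent particles in terms of $q$ and $c$: 

$$\mathbf{x}_\alpha = f_\alpha(t, q, c). \tag{1.92}$$

To reexpress the Lagrangian in terms of $q$, $c$, and $F$ we need to 
find $\mathbf{v}_\alpha$ in terms of the generalized velocities $\dot q$ and $\dot c$: we do this 
by differentiating $f_\alpha$ along a path and abstracting to arbitrary 
velocities (see section 1.6.1): 

$$\mathbf{v}_\alpha = \partial_0 f_\alpha(t, q, c) + \partial_1 f_\alpha(t, q, c)\, \dot q + \partial_2 f_\alpha(t, q, c)\, \dot c. \tag{1.93}$$

Substituting these into Lagrangian (1.89), and using 

$$c_{\alpha\beta}^2 = (\mathbf{x}_\beta - \mathbf{x}_\alpha)^2, \tag{1.94}$$

we find 

$$L'(t; q, c, F; \dot q, \dot c, \dot F) = \sum_\alpha \tfrac{1}{2} m_\alpha \left( \partial_0 f_\alpha(t, q, c) + \partial_1 f_\alpha(t, q, c)\, \dot q + \partial_2 f_\alpha(t, q, c)\, \dot c \right)^2 - \mathcal{V}(t, f(t, q, c)) - \sum_{\{\alpha, \beta \mid \alpha < \beta,\ \alpha \leftrightarrow \beta\}} \frac{F_{\alpha\beta}}{2 l_{\alpha\beta}} \left[ c_{\alpha\beta}^2 - l_{\alpha\beta}^2 \right]. \tag{1.95}$$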



70 Typically the number of components of x is equal to the sum of the number 
of components of q and c; adding a strut removes a degree of freedom and 
adds a distance constraint. However, there are singular cases for which the 
addition of a single strut can remove more than a single degree of freedom. We 
do not consider the singular cases here. 
The Lagrange equations are derived by the usual procedure. 
Rather than write out all the gory details, let's think about how 
it will go. 

The Lagrange equations associated with $F$ just restate the 
constraints: 

$$0 = c_{\alpha\beta}^2(t) - l_{\alpha\beta}^2, \tag{1.96}$$

and consequently we know that along a solution path $c(t) = l$, 
and $Dc(t) = D^2 c(t) = 0$. We can use this result to simplify the 
Lagrange equations associated with $q$ and $c$. 

The Lagrange equations associated with $q$ are the same as if 
they were derived from the Lagrangian 71 

$$L''(t, q, \dot q) = \sum_\alpha \tfrac{1}{2} m_\alpha \left( \partial_0 f_\alpha(t, q, l) + \partial_1 f_\alpha(t, q, l)\, \dot q \right)^2 - \mathcal{V}(t, f(t, q, l)), \tag{1.97}$$

but this is exactly $T - V$ where $T$ and $V$ are computed from the 
generalized coordinates $q$, with fixed constraints. Notice that the 
constraint forces do not appear in the Lagrange equations for q 
because in the Lagrange equations they are multiplied by a term 
that is identically zero on the solution paths. So the Lagrange 
equations for T — V with irredundant generalized coordinates q 
and fixed constraints are equivalent to Newton’s equations with 
vector constraint forces. 

The Lagrange equations for $c$ can be used to find the constraint 
forces. The Lagrange equations are a big mess so we will not show 
them explicitly, but in general they are equations in $D^2 c$, $Dc$, and 
$c$ that will depend upon $q$, $Dq$, and $F$. The dependence on $F$ is 
linear, so we can solve for $F$ in terms of the solution path $q$ and 
$Dq$, with $c = l$ and $Dc = D^2 c = 0$. 

If we are not interested in the constraint forces, we can abandon 
the full Lagrangian (1.95) in favor of Lagrangian (1.97), which is 
equivalent as far as the evolution of the generalized coordinates $q$ 
is concerned. 

71 Consider a function $g$ of, say, three arguments, and let $g_0$ be a function of two 
arguments satisfying $g_0(x, y) = g(x, y, 0)$. Then $(\partial_0 g_0)(x, y) = (\partial_0 g)(x, y, 0)$. 
The substitution of a value in an argument commutes with the taking of 
the partial derivative with respect to a different argument. In deriving the 
Lagrange equations for $q$ we can set $c = l$ and $\dot c = 0$ in the Lagrangian, but we 
cannot do this in deriving the Lagrange equations associated with $c$, because 
we have to take derivatives with respect to those arguments. 

The same derivation goes through even if the lengths $l_{\alpha\beta}$ specified 
in the interparticle distance constraints are functions of time. 
It can also be generalized to allow distance constraints relative to 
time-dependent positions, by making some of the positions of particles 
$\mathbf{x}_\beta$ specified functions of time. 

More generally 

Consider a constraint of the form 

$$\varphi(t, x(t)) = 0, \tag{1.98}$$

where $x(t)$ is the structure of all the rectangular components $x_\alpha(t)$ 
at time $t$. In section 1.10 we will show, using the variational 
principle, that an appropriate Lagrangian for this system is 

$$L(t; x, \lambda; \dot x, \dot\lambda) = \sum_\alpha \tfrac{1}{2} m_\alpha \dot{\mathbf{x}}_\alpha^2 - \mathcal{V}(t, x) + \lambda \varphi(t, x), \tag{1.99}$$

where $\lambda$ is an additional coordinate and $\dot\lambda$ is the corresponding 
generalized velocity. The Lagrange equations associated with $\lambda$ 
are just a restatement of the constraints: $\varphi(t, x(t)) = 0$. The 
Lagrange equations for the particle coordinates are: 

$$D(m_\alpha D\mathbf{x}_\alpha)(t) = -\partial_{1,\alpha} \mathcal{V}(t, x(t)) + \lambda(t)\, \partial_{1,\alpha} \varphi(t, x(t)). \tag{1.100}$$

Such a constraint can also be modeled by including appropriate 
constraint forces in Newton's equations: 

$$D(m_\alpha D\mathbf{x}_\alpha)(t) = -\nabla_{\mathbf{x}_\alpha} \mathcal{V}(t, \mathbf{x}_0(t), \ldots, \mathbf{x}_{N-1}(t)) + \mathbf{F}_\alpha(t). \tag{1.101}$$

For equations (1.100) to be the same as equations (1.101) we must 
identify $\lambda(t)\, \partial_{1,\alpha} \varphi(t, x(t))$ with the forces of constraint on particle 
$\alpha$. Notice that these forces of constraint are proportional to the 
normal to the constraint surface at each instant and thus do no 
work for motions that obey the constraint. 
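The no-work remark can be made explicit in the special case where the constraint has no explicit time dependence, $\partial_0 \varphi = 0$ (an assumption made here for the illustration). Differentiating the constraint along a path that satisfies it gives:

```latex
\begin{aligned}
0 = D\big(\varphi(t, x(t))\big)
  &= \partial_0 \varphi(t, x(t))
   + \sum_\alpha \partial_{1,\alpha}\varphi(t, x(t)) \cdot D\mathbf{x}_\alpha(t), \\
\text{so for } \partial_0 \varphi = 0: \quad
\sum_\alpha \lambda(t)\, \partial_{1,\alpha}\varphi(t, x(t)) \cdot D\mathbf{x}_\alpha(t) &= 0 .
\end{aligned}
```

The left side of the last line is the total power delivered by the constraint forces. When the constraint does depend explicitly on time, as for a driven pivot, the same calculation leaves the term $-\lambda\, \partial_0 \varphi$, so the constraint forces can do work on the actual motion even though they remain normal to the instantaneous constraint surface.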

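Lagrangian (1.89), which we developed above to include 
Newtonian forces of constraint for position constraints, is exactly of 
this form. We can identify 

$$\lambda(t)\, \varphi(t, x(t)) = -\sum_{\{\alpha, \beta \mid \alpha < \beta,\ \alpha \leftrightarrow \beta\}} \frac{F_{\alpha\beta}(t)}{2 l_{\alpha\beta}} \left[ (\mathbf{x}_\beta(t) - \mathbf{x}_\alpha(t))^2 - l_{\alpha\beta}^2 \right]. \tag{1.102}$$

The forces of constraint satisfy 

$$\lambda(t)\, \partial_{1,\alpha} \varphi(t, x(t)) = \sum_{\{\beta \mid \alpha \leftrightarrow \beta\}} F_{\alpha\beta}(t)\, \frac{\mathbf{x}_\beta(t) - \mathbf{x}_\alpha(t)}{l_{\alpha\beta}}. \tag{1.103}$$

Figure 1.5 A rigid rod of length $l$ constrains two massive particles in 
a plane. 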

Accepting Lagrangian (1.99) as describing systems with 
constraints of the form (1.98), we can make a coordinate transformation 
from the redundant coordinates $x$ to irredundant generalized 
coordinates $q$ and constraint coordinates $c = \varphi(t, x)$, as above. 
The coordinate $\lambda$ will not appear in the Lagrange equations for 
$q$ because on solution paths it will be multiplied by a factor 
that is identically zero. If we are interested only in the evolution 
of the generalized coordinates we can assume the constraints are 
identically satisfied and take the Lagrangian to be the difference 
of the kinetic and potential energies expressed in terms of the 
generalized coordinates. 

Exercise 1.21: The dumbbell 

In this exercise we will recapitulate the derivation of the Lagrangian for 
constrained systems for a particular simple system. 

Consider two massive particles in the plane constrained by a massless 
rigid rod to remain a distance l apart, as in figure 1.5. There are appar- 
ently four degrees of freedom for two massive particles in the plane, but 
the rigid rod reduces this number to three. 
We can uniquely specify the configuration with the redundant 
coordinates of the particles, say $x_0(t), y_0(t)$ and $x_1(t), y_1(t)$. The constraint 
$(x_1(t) - x_0(t))^2 + (y_1(t) - y_0(t))^2 = l^2$ eliminates one degree of freedom. 

a. Write Newton's equations for the balance of forces for the four 
rectangular coordinates of the two particles, given that the scalar tension in 
the rod is $F$. 

b. Write the formal Lagrangian 

$$L(t; x_0, y_0, x_1, y_1, F; \dot x_0, \dot y_0, \dot x_1, \dot y_1, \dot F)$$

such that Lagrange's equations will yield the Newton's equations that 
you derived in part a. 

c. Make a change of coordinates to a coordinate system with center of 
mass coordinates $x_{cm}, y_{cm}$, angle $\theta$, distance between the particles $c$, and 
tension force $F$. Write the Lagrangian in these coordinates, and write 
the Lagrange equations. 

d. You may deduce from one of these equations that $c(t) = l$. From 
this fact we get that $Dc = 0$ and $D^2 c = 0$. Substitute these into the 
Lagrange equations you just computed to get the equations of motion for 
$x_{cm}, y_{cm}, \theta$. 

e. Make a Lagrangian ($= T - V$) for the system described with the 
irredundant generalized coordinates $x_{cm}, y_{cm}, \theta$ and compute the Lagrange 
equations from this Lagrangian. They should be the same equations as 
you derived for the same coordinates from part d. 

Exercise 1.22: Driven pendulum 

Show that the Lagrangian (1.89) can be used to describe the driven 
pendulum, where the position of the pivot is a specified function of 
time. Derive the equations of motion using the Newtonian constraint 
force prescription, and show that they are the same as the Lagrange 
equations. Be sure to examine the equations for the constraint forces as 
well as the position of the pendulum bob. 

Exercise 1.23: Fill in the details 

Show that the Lagrange equations for Lagrangian (1.97) are the same 
as the Lagrange equations for Lagrangian (1.95) with the substitution 
c(t) = l, Dc(t) = D 2 c(t) = 0. 

Exercise 1.24: Constraint forces 

Find the tension in an undriven planar pendulum. 
1.6.3 Constraints as Coordinate Transformations 

The derivation of a Lagrangian for a constrained system involves 
steps that are analogous to the steps in the derivation of a coor- 
dinate transformation. 

For a constrained system one specifies the rectangular coordi- 
nates of the constituent particles in terms of generalized coordi- 
nates that incorporate the constraints. We then determine the 
rectangular velocities of the constituent particles as functions of the 
generalized coordinates and the generalized velocities. The La- 
grangian that we know how to express in rectangular coordinates 
and velocities of the constituent particles can then be reexpressed 
in terms of the generalized coordinates and velocities. 

To carry out a coordinate transformation one specifies how the 
configuration of a system expressed in one set of generalized coor- 
dinates can be reexpressed in terms of another set of generalized 
coordinates. We then determine the transformation of general- 
ized velocities implied by the transformation of generalized coor- 
dinates. A Lagrangian that is expressed in terms of one of the 
sets of generalized coordinates can then be reexpressed in terms 
of the other set of generalized coordinates. 

These are really two applications of the same process, so we 
can make Lagrangians for constrained systems by composing a 
Lagrangian for unconstrained particles with a coordinate trans- 
formation that incorporates the constraint. Our deduction that 
$L = T - V$ is a suitable Lagrangian for constrained systems was 
in fact based on a coordinate transformation from a set of coor- 
dinates subject to constraints to a set of irredundant coordinates 
plus constraint coordinates that are constant. 

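Let $\mathbf{x}_\alpha$ be the tuple of rectangular components of the 
constituent particle with index $\alpha$, and $\mathbf{v}_\alpha$ be its velocity. The 
Lagrangian 

$$L_f(t; \mathbf{x}_0, \ldots, \mathbf{x}_{N-1}; \mathbf{v}_0, \ldots, \mathbf{v}_{N-1}) = \sum_\alpha \tfrac{1}{2} m_\alpha v_\alpha^2 - \mathcal{V}(t; \mathbf{x}_0, \ldots, \mathbf{x}_{N-1}; \mathbf{v}_0, \ldots, \mathbf{v}_{N-1}) \tag{1.104}$$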

is the difference of kinetic and potential energies of the constituent 
particles. This is a suitable Lagrangian for a set of unconstrained 
free particles with potential energy V. 

Let $q$ be a tuple of irredundant generalized coordinates, and $v$ 
be the corresponding generalized velocity tuple. The coordinates 
$q$ are related to $\mathbf{x}_\alpha$, the coordinates of the constituent particles, by 
$\mathbf{x}_\alpha = f_\alpha(t, q)$, as before. The constraints among the constituent 
particles are taken into account in the definition of the $f_\alpha$. Here 
we view this as a coordinate transformation. What is unusual 
about this as a coordinate transformation is that the dimension 
of $x$ is not the same as the dimension of $q$. From this coordinate 
transformation we can find the local-tuple transformation function 
(see section 1.6.1) 

$$(t; \mathbf{x}_0, \ldots, \mathbf{x}_{N-1}; \mathbf{v}_0, \ldots, \mathbf{v}_{N-1}) = C(t, q, v). \tag{1.105}$$

A Lagrangian for the constrained system can be obtained from 
the Lagrangian for the unconstrained system by composing it with 
the local-tuple transformation function from constrained coordi- 
nates to unconstrained coordinates: 

\[ L = L_f \circ C. \tag{1.106} \]

The constraints enter only in the transformation. 

To illustrate this we will find a Lagrangian for the driven
pendulum introduced in section 1.6.2. The $T - V$ Lagrangian for a
free particle of mass $m$ in a vertical plane subject to a
gravitational potential with acceleration $g$ is

\[ L_f(t; x, y; v_x, v_y) = \tfrac{1}{2} m (v_x^2 + v_y^2) - m g y, \tag{1.107} \]

where $y$ measures the vertical height of the point mass. As a
program

(define ((Lf m g) local)
  (let ((q (coordinate local))
        (v (velocity local)))
    (let ((y (ref q 1)))
      (- (* 1/2 m (square v)) (* m g y)))))

The coordinate transformation from generalized coordinate $\theta$ to
rectangular coordinates is $x = l \sin\theta$,
$y = y_s(t) - l \cos\theta$, where $l$ is the length of the pendulum
and $y_s$ gives the height of the support as a function of time. It is
interesting that the drive enters only through the specification of
the constraints. As a program

(define ((dp-coordinates l y_s) local)
  (let ((t (time local))
        (theta (coordinate local)))
    (let ((x (* l (sin theta)))
          (y (- (y_s t) (* l (cos theta)))))
      (up x y))))

Using F->C we can deduce the local-tuple transformation and
define the Lagrangian for the driven pendulum by composition:

(define (L-pend m l g y_s)
  (compose (Lf m g)
           (F->C (dp-coordinates l y_s))))

The Lagrangian is

(show-expression
 ((L-pend 'm 'l 'g (literal-function 'y_s))
  (->local 't 'theta 'thetadot)))

\[ g l m \cos\theta - g m y_s(t) + \tfrac{1}{2} l^2 m \dot\theta^2 + l m \dot\theta\, Dy_s(t) \sin\theta + \tfrac{1}{2} m (Dy_s(t))^2 \]



This is the same as the Lagrangian found in section 1.6.2. 
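
The composition can also be checked numerically outside Scmutils. The following is a minimal plain-Python sketch, not the book's code: `F_to_C` approximates the velocity part of the local-tuple transformation with central differences (Scmutils differentiates exactly), the step `h` is an arbitrary choice, and the drive $y_s(t) = a \cos \omega t$ and all parameter values are assumed purely for illustration.

```python
import math

def Lf(m, g):
    """T - V Lagrangian of a free particle in a uniform gravitational field."""
    def L(local):
        t, (x, y), (vx, vy) = local
        return 0.5 * m * (vx * vx + vy * vy) - m * g * y
    return L

def dp_coordinates(l, y_s):
    """Map (t, theta) to the rectangular position of the pendulum bob."""
    def F(t, theta):
        return (l * math.sin(theta), y_s(t) - l * math.cos(theta))
    return F

def F_to_C(F, h=1e-6):
    """Numerical stand-in for F->C: (t, q, v) -> (t, x, dx/dt)."""
    def C(local):
        t, q, v = local
        dF_dt = [(a - b) / (2 * h) for a, b in zip(F(t + h, q), F(t - h, q))]
        dF_dq = [(a - b) / (2 * h) for a, b in zip(F(t, q + h), F(t, q - h))]
        velocity = tuple(dt + dq * v for dt, dq in zip(dF_dt, dF_dq))
        return (t, F(t, q), velocity)
    return C

def L_pend(m, l, g, y_s):
    """Driven-pendulum Lagrangian as the composition Lf o C."""
    free, C = Lf(m, g), F_to_C(dp_coordinates(l, y_s))
    return lambda local: free(C(local))

def L_pend_closed_form(m, l, g, y_s, Dy_s):
    """The expression displayed above, written out term by term."""
    def L(local):
        t, th, thd = local
        return (g * l * m * math.cos(th) - g * m * y_s(t)
                + 0.5 * l * l * m * thd * thd
                + l * m * thd * Dy_s(t) * math.sin(th)
                + 0.5 * m * Dy_s(t) ** 2)
    return L

# Assumed sample values, for illustration only.
m, l, g, a, w = 1.0, 1.0, 9.8, 0.1, 2.0
y_s = lambda t: a * math.cos(w * t)
Dy_s = lambda t: -a * w * math.sin(w * t)
state = (0.3, 0.7, -1.2)   # (t, theta, thetadot)
numeric = L_pend(m, l, g, y_s)(state)
exact = L_pend_closed_form(m, l, g, y_s, Dy_s)(state)
```

The two values should agree to within the finite-difference error. The free-particle Lagrangian `Lf` never mentions the pendulum: the constraint and the drive live entirely in `dp_coordinates`.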

We have found a very interesting decomposition of the La- 
grangian for constrained systems. One part consists of the dif- 
ference of the kinetic and potential energy of the constituents. 
The other part describes the constraints that are specific to the 
configuration of a particular system. 

1.6.4 The Lagrangian is Not Unique 

Lagrangians are not in a one-to-one relationship with physical 
systems — many Lagrangians can be used to describe the same 
physical system. In this section we will demonstrate this by show- 
ing that the addition to the Lagrangian of a “total time deriva- 
tive” of a function of the coordinates and time does not change 
the paths of stationary action or the equations of motion deduced 
from the action principle.

Total time derivatives

Let’s first explain what we mean by a “total time derivative.” Let
$F$ be a function of time and coordinates. Then the time derivative
of $F$ along a path $q$ is

\[ D(F \circ \Gamma[q]) = (DF \circ \Gamma[q])\, D\Gamma[q]. \tag{1.108} \]

Because $F$ only depends on time and coordinates:

\[ DF \circ \Gamma[q] = [\partial_0 F \circ \Gamma[q],\ \partial_1 F \circ \Gamma[q]]. \tag{1.109} \]

So we only need the first two components of $D\Gamma[q]$,

\[ (D\Gamma[q])(t) = (1, Dq(t), D^2 q(t), \ldots), \tag{1.110} \]

to form the product

\[ D(F \circ \Gamma[q]) = \partial_0 F \circ \Gamma[q] + (\partial_1 F \circ \Gamma[q])\, Dq = (\partial_0 F + (\partial_1 F)\dot{Q}) \circ \Gamma[q], \tag{1.111} \]

where $\dot{Q} = I_2$ is a selector function (see footnote 72):
$c = \dot{Q}(a, b, c)$, so $Dq = \dot{Q} \circ \Gamma[q]$. The function

\[ D_t F = \partial_0 F + (\partial_1 F)\dot{Q} \tag{1.112} \]

is called the total time derivative of $F$; it is a function of three
arguments: the time, the generalized coordinates, and the
generalized velocities.

In general, the total time derivative of a local-tuple function $F$
is that function $D_t F$ that when composed with a local-tuple path
is the time derivative of the composition of the function $F$ with
the same local-tuple path:

\[ D_t F \circ \Gamma[q] = D(F \circ \Gamma[q]). \tag{1.113} \]

The total time derivative $D_t F$ is explicitly given by

\[ D_t F(t, q, v, a, \ldots) = \partial_0 F(t, q, v, a, \ldots) + \partial_1 F(t, q, v, a, \ldots)\, v + \partial_2 F(t, q, v, a, \ldots)\, a + \cdots, \tag{1.114} \]

where we take as many terms as needed to exhaust the arguments
of $F$.

72 Components of a tuple structure, such as the value of $\Gamma[q](t)$,
can be selected with selector functions: $I_i$ gets the element with
index $i$ from the tuple.
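
Property (1.113) is easy to test numerically. The sketch below is in plain Python rather than Scmutils; the function `F`, the path `q`, and the sample time are arbitrary choices made for illustration, and `DtF` is worked out by hand from (1.112) for that particular `F`.

```python
import math

def F(t, q):
    # Sample function of time and coordinate (an arbitrary choice).
    return t * q * q + math.sin(q)

def DtF(t, q, v):
    # Total time derivative d0 F + (d1 F) v, worked out by hand for F above.
    return q * q + (2.0 * t * q + math.cos(q)) * v

def q(t):
    # Sample path (also arbitrary).
    return math.cos(3.0 * t)

def Dq(t):
    return -3.0 * math.sin(3.0 * t)

def derivative(f, t, h=1e-5):
    # Central-difference approximation of Df(t).
    return (f(t + h) - f(t - h)) / (2.0 * h)

t0 = 0.4
lhs = DtF(t0, q(t0), Dq(t0))                  # (D_t F o Gamma[q])(t0)
rhs = derivative(lambda t: F(t, q(t)), t0)    # D(F o Gamma[q])(t0)
```

Here `lhs` evaluates the state function $D_t F$ on the local tuple of the path, while `rhs` differentiates the composition along the path; the two agree up to the finite-difference error.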

Exercise 1.25: Properties of $D_t$

The total time derivative $D_t F$ is not the derivative of the function $F$.
Nevertheless, the total time derivative shares many properties with the
derivative. Demonstrate that $D_t$ has the following properties for
local-tuple functions $F$ and $G$, number $c$, and a function $H$ with
domain containing the range of $G$.

a. $D_t(F + G) = D_t F + D_t G$

b. $D_t(cF) = c\, D_t F$

c. $D_t(FG) = F\, D_t G + (D_t F)\, G$

d. $D_t(H \circ G) = (DH \circ G)\, D_t G$.

Adding total time derivatives to Lagrangians 

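Consider two Lagrangians $L$ and $L'$ that differ by the addition of
a total time derivative of a function $F$ that depends only on the
time and the coordinates

\[ L' = L + D_t F. \tag{1.115} \]

The corresponding action integral is

\[ \begin{aligned}
S'[q](t_1, t_2) &= \int_{t_1}^{t_2} L' \circ \Gamma[q] \\
&= \int_{t_1}^{t_2} (L + D_t F) \circ \Gamma[q] \\
&= \int_{t_1}^{t_2} L \circ \Gamma[q] + \int_{t_1}^{t_2} D(F \circ \Gamma[q]) \\
&= S[q](t_1, t_2) + (F \circ \Gamma[q])\big|_{t_1}^{t_2}.
\end{aligned} \tag{1.116} \]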

The variational principle states that the action integral along a
realizable trajectory is stationary with respect to variations of the
trajectory that leave the configuration at the endpoints fixed. The
action integrals $S[q](t_1, t_2)$ and $S'[q](t_1, t_2)$ differ by a term

\[ (F \circ \Gamma[q])\big|_{t_1}^{t_2} = F(t_2, q(t_2)) - F(t_1, q(t_1)) \tag{1.117} \]

that depends only on the coordinates and time at the endpoints
and these are not allowed to vary. Thus, if $S[q](t_1, t_2)$ is stationary
for a path, then $S'[q](t_1, t_2)$ will also be stationary. So either
Lagrangian can be used to distinguish the realizable paths.
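
Equation (1.116) holds for any path, realizable or not, which makes it simple to check by quadrature. The following plain-Python sketch does so under assumed choices: a harmonic-oscillator Lagrangian with unit mass and spring constant, an arbitrary `F`, an arbitrary trial path, and a composite Simpson rule in place of Scmutils' integrators.

```python
import math

def simpson(f, a, b, n=2000):
    # Composite Simpson rule with an even number n of subintervals.
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def L(t, q, v):          # harmonic-oscillator Lagrangian with m = k = 1
    return 0.5 * v * v - 0.5 * q * q

def F(t, q):             # arbitrary function of time and coordinate
    return t * q * q

def DtF(t, q, v):        # d0 F + (d1 F) v for the F above
    return q * q + 2.0 * t * q * v

def q(t):                # arbitrary trial path, not a realizable one
    return math.sin(2.0 * t) + t

def Dq(t):
    return 2.0 * math.cos(2.0 * t) + 1.0

t1, t2 = 0.0, 1.0
S = simpson(lambda t: L(t, q(t), Dq(t)), t1, t2)
S_prime = simpson(lambda t: L(t, q(t), Dq(t)) + DtF(t, q(t), Dq(t)), t1, t2)
endpoint_term = F(t2, q(t2)) - F(t1, q(t1))
```

The difference `S_prime - S` matches `endpoint_term` to quadrature accuracy, so varying the interior of the path changes both actions by the same amount.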

The addition of a total time derivative to a Lagrangian does 
not affect whether the action is critical for a given path. So if we 
have two Lagrangians that differ by a total time derivative the 
corresponding Lagrange equations are equivalent in that the same 
paths satisfy each. Moreover, the additional terms introduced into 
the action by the total time derivative only appear in the endpoint 
condition and thus do not affect the Lagrange equations derived 
from the variation of the action, so the Lagrange equations are the 
same. So the Lagrange equations are not changed by the addition 
of a total time derivative to a Lagrangian. 

Exercise 1.26: Lagrange equations for total time derivatives

Let $F(t, q)$ be a function of $t$ and $q$ only, with total time derivative

\[ D_t F = \partial_0 F + (\partial_1 F)\dot{Q}. \tag{1.118} \]

Show explicitly that the Lagrange equations for $D_t F$ are identically zero,
and thus that the addition of $D_t F$ to a Lagrangian does not affect the
Lagrange equations.

The driven pendulum provides a nice illustration of adding total
time derivatives to Lagrangians. The equation of motion for the
driven pendulum (see section 1.6.2),

\[ m l^2 D^2\theta(t) + m l (g + D^2 y_s(t)) \sin\theta(t) = 0, \tag{1.119} \]

has an interesting and suggestive interpretation: it is the same as
the equation of motion of an undriven pendulum, except that the
acceleration of gravity $g$ is augmented by the acceleration of the
pivot $D^2 y_s$. This intuitive interpretation was not apparent in the
Lagrangian derived as the difference of the kinetic and potential
energies in section 1.6.2. However, we can write an alternate
Lagrangian that has the same equations of motion and is as easy to
interpret as the equation of motion:

\[ L'(t, \theta, \dot\theta) = \tfrac{1}{2} m l^2 \dot\theta^2 + m l (g + D^2 y_s(t)) \cos\theta. \tag{1.120} \]

With this Lagrangian it is apparent that the effect of the
accelerating pivot is to modify the acceleration of gravity. Note, however,
that it is not the difference of the kinetic and potential energies.
Let’s compare the two Lagrangians for the driven pendulum. The
difference $\Delta L = L - L'$ is

\[ \Delta L(t, \theta, \dot\theta) = \tfrac{1}{2} m (Dy_s(t))^2 + m l\, Dy_s(t)\, \dot\theta \sin\theta - g m y_s(t) - m l\, D^2 y_s(t) \cos\theta. \tag{1.121} \]

The two terms in $\Delta L$ that do not depend on either $\theta$ or
$\dot\theta$ do not affect the equations of motion. The remaining two
terms are the total time derivative of the function
$F(t, \theta) = -m l\, Dy_s(t) \cos\theta$, which does not depend on
$\dot\theta$. The addition of such terms to a
Lagrangian does not affect the equations of motion.
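
To make the last claim concrete, equation (1.112) can be applied to this $F$ term by term. The intermediate partial derivatives below are our own working rather than part of the original text.

```latex
\begin{aligned}
F(t,\theta) &= -m l\, Dy_s(t) \cos\theta, \\
\partial_0 F(t,\theta) &= -m l\, D^2 y_s(t) \cos\theta, \\
\partial_1 F(t,\theta) &= m l\, Dy_s(t) \sin\theta, \\
D_t F(t,\theta,\dot\theta) &= \partial_0 F + (\partial_1 F)\,\dot\theta
  = -m l\, D^2 y_s(t) \cos\theta + m l\, Dy_s(t)\,\dot\theta \sin\theta .
\end{aligned}
```

These are the second and fourth terms of (1.121).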

Identification of total time derivatives 

If the local-tuple function $G$, with arguments $(t, q, v)$, is the total
time derivative of a function $F$, with arguments $(t, q)$, then $G$
must have certain properties.

From equation (1.112), we see that $G$ must be linear in the
generalized velocities

\[ G(t, q, v) = G_1(t, q, v)\, v + G_2(t, q, v) \tag{1.122} \]

where neither $G_1$ nor $G_2$ depends on the generalized velocities:
$\partial_2 G_1 = \partial_2 G_2 = 0$.

If $G$ is the total time derivative of $F$ then $G_1 = \partial_1 F$ and
$G_2 = \partial_0 F$, so

\[ \partial_0 G_1 = \partial_0 \partial_1 F, \qquad \partial_1 G_2 = \partial_1 \partial_0 F. \tag{1.123} \]

The partial derivative with respect to the time argument does
not have structure, so $\partial_0 \partial_1 F = \partial_1 \partial_0 F$.
So if $G$ is the total time derivative of $F$ then

\[ \partial_0 G_1 = \partial_1 G_2. \tag{1.124} \]

Furthermore, $G_1 = \partial_1 F$, so

\[ \partial_1 G_1 = \partial_1 \partial_1 F. \tag{1.125} \]

If there is more than one degree of freedom these partials are
actually structures of partial derivatives with respect to each
coordinate. The partial derivatives with respect to two different
coordinates must be the same independent of the order of the
differentiation. So $\partial_1 G_1$ must be symmetric.

Note that we have not shown that these conditions are sufficient
for determining that a function is a total time derivative, only that
they are necessary.
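
Condition (1.124) can be screened numerically once $G$ has been split into $G_1$ and $G_2$. The plain-Python sketch below handles a single degree of freedom, where the symmetry condition on $\partial_1 G_1$ is automatic. The helper names, sample points, tolerance, and the two test functions are assumptions made for illustration; a passing result shows only that the necessary condition is not violated at the sampled points.

```python
import math

def partial(f, index, h=1e-5):
    """Central-difference partial derivative of f(t, q) in argument `index`."""
    def df(t, q):
        if index == 0:
            return (f(t + h, q) - f(t - h, q)) / (2.0 * h)
        return (f(t, q + h) - f(t, q - h)) / (2.0 * h)
    return df

def passes_necessary_condition(G1, G2, samples, tol=1e-6):
    """Check d0 G1 = d1 G2 (equation 1.124) at each sample point (t, q)."""
    d0G1, d1G2 = partial(G1, 0), partial(G2, 1)
    return all(abs(d0G1(t, q) - d1G2(t, q)) < tol for t, q in samples)

samples = [(0.1, 0.5), (0.7, -1.3), (2.0, 0.9)]

# G(t, q, v) = t^2 v + 2 t q is D_t F for F(t, q) = t^2 q, so it passes.
ok = passes_necessary_condition(lambda t, q: t * t,
                                lambda t, q: 2.0 * t * q, samples)

# G(t, q, v) = t v has G1 = t and G2 = 0, so d0 G1 = 1 but d1 G2 = 0.
bad = passes_necessary_condition(lambda t, q: t,
                                 lambda t, q: 0.0, samples)
```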

Exercise 1.27: Identifying total time derivatives

For each of the following functions, either show that it is not a total
time derivative or produce a function from which it can be derived.

a. $G(t, x, v_x) = m v_x$

b. $G(t, x, v_x) = m v_x \cos t$

c. $G(t, x, v_x) = v_x \cos t - x \sin t$

d. $G(t, x, v_x) = v_x \cos t + x \sin t$

e. $G(t; x, y; v_x, v_y) = 2(x v_x + y v_y) \cos t - (x^2 + y^2) \sin t$

f. $G(t; x, y; v_x, v_y) = 2(x v_x + y v_y) \cos t - (x^2 + y^2) \sin t + y^3 v_x + x v_y$

1.7 Evolution of Dynamical State 

Lagrange’s equations are ordinary differential equations that the 
path must satisfy. They can be used to test if a proposed path is 
a realizable path of the system. However, we can also use them 
to develop a path, starting with initial conditions. 

The state of a system is defined to be the information that 
must be specified for the subsequent evolution to be determined. 
Remember our juggler: he or she must throw the pin in a cer- 
tain way for it to execute the desired motion. The juggler has 
control of the initial position and orientation of the pin, and the 
initial velocity and spin of the pin. Our experience with juggling 
and similar systems suggests that the initial configuration and the 
rate of change of the configuration are sufficient to determine the 
subsequent motion. Other systems may require higher derivatives 
of the configuration. 

For Lagrangians that are written in terms of a set of generalized
coordinates and velocities we have shown that Lagrange’s
equations are second-order ordinary differential equations. If the
differential equations can be solved for the highest-order derivatives
and if the differential equations satisfy appropriate conditions 73
then there is a unique solution to the initial-value problem: given
values of the solution and the lower derivatives of the solution at
a particular moment there is a unique solution function. Given
irredundant coordinates the Lagrange equations satisfy these
conditions. 74 Thus a trajectory is determined by the generalized
coordinates and the generalized velocities at any time. This is the
information required to specify the dynamical state.

73 For example, the Lipschitz condition is that the rate of change of the
derivative is bounded by a constant in an open set around each point of the
trajectory. See [22] for a good treatment of the Lipschitz condition.

A complete local description of a path consists of the path and 
all of its derivatives at a moment. The complete local descrip- 
tion of a path can be reconstructed from an initial segment of 
the local tuple, given a prescription for computing higher-order 
derivatives of the path in terms of lower-order derivatives. The 
state of the system is specified by that initial segment of the local 
tuple from which the rest of the complete local description can be 
deduced. The complete local description gives us the path near 
that moment. Actually, all we need is a rule for computing the 
next higher derivative; we can get all the rest from this. Assume 
that the state of a system is given by the tuple $(t, q, v)$. If we are
given a prescription for computing the acceleration $a = A(t, q, v)$,
then

\[ D^2 q = A \circ \Gamma[q], \tag{1.126} \]

and we have as a consequence

\[ D^3 q = D(A \circ \Gamma[q]) = D_t A \circ \Gamma[q], \tag{1.127} \]

and so on. So the higher derivative components of the local tuple
are given by functions $D_t A, D_t^2 A, \ldots$. Each of these functions
depends on lower derivative components of the local tuple. All we 
need to deduce the path from the state is a function that gives 
the next higher derivative component of the local description from 
the state. We use the Lagrange equations to find this function. 
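
As a small worked case (our own illustration, assuming a one-dimensional harmonic oscillator with mass $m$ and spring constant $k$, for which $A(t, q, v) = -(k/m)\, q$), the next component of the local tuple follows from (1.127) and the explicit form (1.114):

```latex
\begin{aligned}
A(t,q,v) &= -\frac{k}{m}\, q, \\
D_t A(t,q,v,a) &= \partial_0 A + (\partial_1 A)\, v + (\partial_2 A)\, a
  = -\frac{k}{m}\, v, \\
D^3 q &= D_t A \circ \Gamma[q] = -\frac{k}{m}\, Dq .
\end{aligned}
```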



74 If the coordinates are redundant we cannot, in general, solve for the
highest-order derivative. However, since we can transform to irredundant
coordinates, and since we can solve the initial-value problem in the
irredundant coordinates, and since we can construct the redundant
coordinates from the irredundant coordinates, we can in general solve the
initial-value problem for redundant coordinates. The only hitch is that we
may not specify arbitrary initial conditions: the initial conditions must be
consistent with the constraints.

First, we expand the Lagrange equations

\[ \partial_1 L \circ \Gamma[q] = D(\partial_2 L \circ \Gamma[q]) \]

so that the second derivative appears explicitly

\[ \partial_1 L \circ \Gamma[q] = \partial_0 \partial_2 L \circ \Gamma[q] + (\partial_1 \partial_2 L \circ \Gamma[q])\, Dq + (\partial_2 \partial_2 L \circ \Gamma[q])\, D^2 q. \]

Solving this system for $D^2 q$ one obtains the generalized
acceleration along a solution path $q$

\[ D^2 q = [\partial_2 \partial_2 L \circ \Gamma[q]]^{-1} [\partial_1 L \circ \Gamma[q] - (\partial_1 \partial_2 L \circ \Gamma[q])\, Dq - \partial_0 \partial_2 L \circ \Gamma[q]] \]

where $[\partial_2 \partial_2 L \circ \Gamma]^{-1}$ is the inverse of the
Hessian matrix. The function that gives the acceleration is

\[ A = (\partial_2 \partial_2 L)^{-1} [\partial_1 L - \partial_0 \partial_2 L - (\partial_1 \partial_2 L)\dot{Q}], \tag{1.128} \]

where $\dot{Q} = I_2$ is the velocity component selector.

That initial segment of the local tuple that specifies the state 
is called the local state tuple, or, more simply, the state tuple. 

We can express the function that gives the acceleration as a 
function of the state tuple as the following procedure. It takes 
a procedure that computes the Lagrangian, and returns a pro- 
cedure that takes a state tuple as its argument and returns the 
acceleration. 75 

(define (Lagrangian->acceleration L)
  (let ((P ((partial 2) L))
        (F ((partial 1) L)))
    (/ (- F
          (+ ((partial 0) P)
             (* ((partial 1) P) velocity)))
       ((partial 2) P))))

75 In Scmutils division by a matrix is interpreted as multiplication on the
left by the inverse matrix.

Once we have a way of computing the acceleration from the
coordinates and the velocities, we can give a prescription for
computing the derivative of the state as a function of the state. For
the state $(t, q(t), Dq(t))$ at the moment $t$ the derivative of the state
is $(1, Dq(t), D^2 q(t)) = (1, Dq(t), A(t, q(t), Dq(t)))$. The procedure
Lagrangian->state-derivative takes a Lagrangian and returns
a procedure that takes a state and returns the derivative of the
state:

(define (Lagrangian->state-derivative L)
  (let ((acceleration (Lagrangian->acceleration L)))
    (lambda (state)
      (up 1
          (velocity state)
          (acceleration state)))))

We represent a state by an up-tuple of the components of that 
initial segment of the local tuple that determine the state. 
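
For readers without Scmutils at hand, formula (1.128) and the state derivative can be sketched in plain Python for a single degree of freedom. This is an illustration only: the partial derivatives are approximated by central differences with an arbitrarily chosen step `h`, whereas Scmutils differentiates exactly, and the function names and sample state are ours.

```python
def lagrangian_to_acceleration(L, h=1e-4):
    """A = (d1 L - d0 d2 L - (d1 d2 L) v) / (d2 d2 L), one degree of freedom."""
    def d1L(t, q, v):
        return (L(t, q + h, v) - L(t, q - h, v)) / (2 * h)
    def d2L(t, q, v):
        return (L(t, q, v + h) - L(t, q, v - h)) / (2 * h)
    def A(state):
        t, q, v = state
        d0d2L = (d2L(t + h, q, v) - d2L(t - h, q, v)) / (2 * h)
        d1d2L = (d2L(t, q + h, v) - d2L(t, q - h, v)) / (2 * h)
        d2d2L = (d2L(t, q, v + h) - d2L(t, q, v - h)) / (2 * h)
        return (d1L(t, q, v) - d0d2L - d1d2L * v) / d2d2L
    return A

def lagrangian_to_state_derivative(L):
    """Return a function mapping the state (t, q, v) to (1, v, A(t, q, v))."""
    A = lagrangian_to_acceleration(L)
    return lambda state: (1.0, state[2], A(state))

def L_harmonic(m, k):
    # One-dimensional harmonic oscillator Lagrangian.
    return lambda t, q, v: 0.5 * m * v * v - 0.5 * k * q * q

# Assumed sample state (t, q, v); the acceleration should be close to -k q / m.
dstate = lagrangian_to_state_derivative(L_harmonic(2.0, 1.0))((0.0, 1.5, -0.3))
```

In one dimension the "division by a matrix" of footnote 75 reduces to ordinary division by $\partial_2 \partial_2 L$.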

For example, the parametric state derivative for a harmonic
oscillator is

(define (harmonic-state-derivative m k)
  (Lagrangian->state-derivative (L-harmonic m k)))

(print-expression
 ((harmonic-state-derivative 'm 'k)
  (up 't (up 'x 'y) (up 'v_x 'v_y))))

(up 1 (up v_x v_y) (up (/ (* -1 k x) m) (/ (* -1 k y) m)))

The Lagrange equations are a second-order system of differential
equations that constrain realizable paths $q$. We can use the state
derivative to express the Lagrange equations as a first-order
system of differential equations that constrain realizable coordinate
paths $q$ and velocity paths $v$:

(define ((Lagrange-equations-first-order L) q v)
  (let ((state-path (qv->state-path q v)))
    (- (D state-path)
       (compose (Lagrangian->state-derivative L)
                state-path))))

(define ((qv->state-path q v) t)
  (up t (q t) (v t)))

For example, we can find the first-order form of the equations of
motion of a two-dimensional harmonic oscillator:

(show-expression
 (((Lagrange-equations-first-order (L-harmonic 'm 'k))
   (up (literal-function 'x)
       (literal-function 'y))
   (up (literal-function 'v_x)
       (literal-function 'v_y)))
  't))

\[ \begin{pmatrix}
0 \\
\begin{pmatrix} Dx(t) - v_x(t) \\ Dy(t) - v_y(t) \end{pmatrix} \\
\begin{pmatrix} \dfrac{k\, x(t)}{m} + Dv_x(t) \\ \dfrac{k\, y(t)}{m} + Dv_y(t) \end{pmatrix}
\end{pmatrix} \]



The zero in the first element of the structure of the Lagrange
equations residuals is just the tautology that time advances uniformly:
the time function is just the identity, so its derivative is 1
and the residual is zero. The equations in the second element
constrain the velocity path to be the derivative of the coordinate
path. The equations in the third element give the rate of change
of the velocity in terms of the applied forces.



Numerical integration 

A set of first-order ordinary differential equations that give the
state derivative in terms of the state can be integrated to find the 
state path that emanates from a given initial state. Numerical 
integrators find approximate solutions of such differential equa- 
tions by a process illustrated in figure 1.6. The state derivative 
produced by Lagrangian->state-derivative can be used by a 
package that numerically integrates systems of first-order ordinary 
differential equations. 

The procedure state-advancer can be used to find the state of 
a system at a specified time, given an initial state, which includes 
the initial time, and a parametric state-derivative procedure. 76 



76 The Scmutils system provides a stable of numerical integration routines 
that can be accessed through this interface. These include quality-controlled 
Runge-Kutta (QCRK4) and Bulirsch-Stoer. The default integration method 
is Bulirsch-Stoer.

Figure 1.6 The input to the system derivative is the state. The
function $A$ gives the acceleration as a function of the components that
determine the state. The output of the system derivative is the derivative
of the state. The integrator takes the derivative of the state as its
input and produces the integrated state, starting at the initial conditions.
Notice how the second-order system is put into first-order form by the
routing of the $Dq(t)$ components in the system derivative.



For example, to advance the state of a two-dimensional harmonic
oscillator we write 77

(print-expression
 ((state-advancer harmonic-state-derivative 2. 1.)
  (up 0. (up 1. 2.) (up 3. 4.))
  10
  1.e-12))

(up 10.
    (up 3.712791664584467 5.420620823651575)
    (up 1.6148030925459906 1.8189103724750977))

The arguments to state-advancer are a parametric state
derivative, harmonic-state-derivative, and the state-derivative
parameters (mass 2. and spring constant 1.). A procedure is
returned that takes an initial state, (up 0. (up 1. 2.) (up 3. 4.)),
a target time, 10, and a relative error tolerance, 1.e-12.
The output is an approximation to the state at the specified final
time.

77 The procedure state-advancer automatically compiles state-derivative
procedures the first time they are encountered. The first time a new
state-derivative is used there is a delay while compilation occurs.
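
To show what such an integrator does with a state derivative, here is a minimal plain-Python sketch. It is not how state-advancer works internally: it uses a fixed-step classical Runge-Kutta method with an assumed step count instead of the quality-controlled methods of footnote 76, and it flattens the state to `(t, x, y, vx, vy)`. The function names are ours.

```python
import math

def harmonic_state_derivative(m, k):
    """State derivative of the two-dimensional harmonic oscillator."""
    def deriv(state):
        t, x, y, vx, vy = state
        return (1.0, vx, vy, -k * x / m, -k * y / m)
    return deriv

def rk4_step(deriv, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, d, scale):
        return tuple(si + scale * di for si, di in zip(s, d))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def advance(deriv, state, t_final, steps=10000):
    """Advance `state` to t_final with a fixed number of equal steps."""
    dt = (t_final - state[0]) / steps
    for _ in range(steps):
        state = rk4_step(deriv, state, dt)
    return state

# Same parameters and initial state as the state-advancer call above.
final = advance(harmonic_state_derivative(2.0, 1.0),
                (0.0, 1.0, 2.0, 3.0, 4.0), 10.0)

# Closed-form check for x: x(t) = x0 cos(w t) + (vx0 / w) sin(w t).
w = math.sqrt(1.0 / 2.0)
x_exact = math.cos(w * 10.0) + (3.0 / w) * math.sin(w * 10.0)
```

Because time is carried as the first component of the state with derivative 1, the integrator needs no separate clock; this is the same routing shown in figure 1.6.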

Consider the driven pendulum, described above, with a periodic
drive. We choose $y_s(t) = a \cos \omega t$.

(define ((periodic-drive amplitude frequency phase) t)
  (* amplitude (cos (+ (* frequency t) phase))))

(define (L-periodically-driven-pendulum m l g a omega)
  (let ((ys (periodic-drive a omega 0)))
    (L-pend m l g ys)))

Lagrange’s equation for this system is:

(show-expression
 (((Lagrange-equations
    (L-periodically-driven-pendulum 'm 'l 'g 'a 'omega))
   (literal-function 'theta))
  't))

\[ D^2\theta(t)\, l^2 m - \cos(\omega t) \sin(\theta(t))\, a l m \omega^2 + \sin(\theta(t))\, g l m \]



The parametric state derivative for the periodically driven
pendulum is

(define (pend-state-derivative m l g a omega)
  (Lagrangian->state-derivative
   (L-periodically-driven-pendulum m l g a omega)))

(show-expression
 ((pend-state-derivative 'm 'l 'g 'a 'omega)
  (up 't 'theta 'thetadot)))

\[ \begin{pmatrix}
1 \\
\dot\theta \\
\dfrac{a \omega^2 \cos(\omega t) \sin\theta}{l} - \dfrac{g \sin\theta}{l}
\end{pmatrix} \]

To examine the evolution of the driven pendulum we need a
mechanism that evolves a system for some interval while
monitoring aspects of the system as it evolves. The procedure evolve
provides this service, using state-advancer repeatedly to advance
the state to the required moments. The procedure evolve takes a
parametric state-derivative and its parameters and returns a
procedure that evolves the system from a specified initial state to a
number of other times, monitoring some aspect of the state at
those times. To generate a plot of the angle versus time we make
a monitor procedure that generates the plot as the evolution
proceeds: 78



(define ((monitor-theta win) state)
  (let ((theta ((principal-value :pi) (coordinate state))))
    (plot-point win (time state) theta)))

(define plot-win (frame 0. 100. :-pi :pi))

((evolve pend-state-derivative
         1.0                    ;m=1kg
         1.0                    ;l=1m
         9.8                    ;g=9.8m/s^2
         0.1                    ;a=1/10 m
         (* 2.0 (sqrt 9.8)))    ;omega
 (up 0.0                        ;t_0=0
     1.                         ;theta_0=1 radian
     0.)                        ;thetadot_0=0 radians/s
 (monitor-theta plot-win)
 0.01                           ;step between plotted points
 100.0                          ;final time
 1.0e-13)                       ;local error tolerance



Figure 1.7 shows the angle $\theta$ versus time for a couple of orbits for
the driven pendulum. The initial conditions for the two runs are
the same except that in one the bob is given a tiny velocity equal to
$10^{-10}$ m/s, about one atom width per second. The initial segments
of the two orbits are indistinguishable. After about 75 seconds the
two orbits diverge and become completely different. This extreme
sensitivity to tiny changes in initial conditions is characteristic of
what is called chaotic behavior. Later, we will investigate this
example further, using other tools such as Lyapunov exponents,
phase space, and Poincaré sections.

78 The results are plotted in a plot-window that is created by the procedure
frame with arguments xmin, xmax, ymin, ymax, that specify the limits of the
plotting area. Points are added to the plot with the procedure plot-point
that takes a plot-window and the abscissa and ordinate of the point to be
plotted.

The procedure principal-value is used to reduce an angle to a standard
interval. The argument to principal-value is the point at which the circle is
to be cut. Thus (principal-value :pi) is a procedure that reduces an angle
$\theta$ to the interval $-\pi \le \theta < \pi$.

Figure 1.7 Orbits of the driven pendulum. The angle $\theta$ is plotted
against time. Because angles are periodic, this plot may be thought
of as wound around a cylinder. The upper plot shows the results of a
simulation with initial conditions $\theta = 1$ and $\dot\theta = 0$. The
orbit oscillates for a while, then circulates, then resumes oscillating.
In the lower plot we show the result for a slightly different initial
angular velocity, $\dot\theta = 10^{-10}$. The behavior is at first
indistinguishable from the top figure, but the two trajectories become
uncorrelated after the transition between oscillation and circulation.
This extreme sensitivity to initial conditions is characteristic of
systems with chaotic behavior.
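
The experiment can be imitated in plain Python. The sketch below is a stand-in, not the book's evolve: it uses fixed-step classical Runge-Kutta with an assumed step of 0.001 s instead of a quality-controlled integrator, returns the visited states instead of plotting them, and only runs to $t = 1$ so that it finishes quickly. The acceleration is the one displayed in the state derivative above; the mass cancels and is kept as a parameter only to mirror the Scheme signature.

```python
import math

def pend_state_derivative(m, l, g, a, omega):
    """(t, theta, thetadot) -> (1, thetadot, acceleration); m cancels out."""
    def deriv(state):
        t, theta, thetadot = state
        accel = (a * omega ** 2 * math.cos(omega * t) * math.sin(theta)
                 - g * math.sin(theta)) / l
        return (1.0, thetadot, accel)
    return deriv

def rk4_step(deriv, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, d, scale):
        return tuple(si + scale * di for si, di in zip(s, d))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def evolve(deriv, state, t_final, dt=0.001):
    """Fixed-step stand-in for evolve: returns the list of visited states."""
    steps = round((t_final - state[0]) / dt)
    path = [state]
    for _ in range(steps):
        state = rk4_step(deriv, state, dt)
        path.append(state)
    return path

def principal_value(theta):
    """Reduce an angle to the interval -pi <= theta < pi."""
    return (theta + math.pi) % (2 * math.pi) - math.pi

deriv = pend_state_derivative(1.0, 1.0, 9.8, 0.1, 2.0 * math.sqrt(9.8))
run_a = evolve(deriv, (0.0, 1.0, 0.0), 1.0)      # thetadot_0 = 0
run_b = evolve(deriv, (0.0, 1.0, 1e-10), 1.0)    # thetadot_0 = 1e-10
early_gap = abs(run_a[-1][1] - run_b[-1][1])
```

Over this short interval the two runs stay extremely close. Raising `t_final` to 100 and plotting `principal_value(theta)` against time for both runs is how one would look for the later divergence; since this integrator has no error control, the long-time details need not match figure 1.7.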



1.8 Conserved Quantities 

A quantity that is a function of the state of the system that is 
constant along a solution path is called a conserved quantity or a 
constant of motion. If C is a conserved quantity, then 

\[ D(C \circ \Gamma[q]) = D_t C \circ \Gamma[q] = 0 \tag{1.129} \]

for solution paths q. Following historical practice we also refer 
to constants of the motion as integrals of the motion. 79 In this 
section, we will investigate systems with symmetry and find that 
symmetries are associated with conserved quantities. For instance, 
linear momentum is conserved in a system with translational sym- 
metry, angular momentum is conserved if there is rotational sym- 
metry, energy is conserved if the system does not depend on the 
origin of time. We first consider systems for which a coordinate 
system can be chosen that naturally expresses the symmetry, and 
later discuss systems for which no coordinate system can be chosen 
that simultaneously expresses all symmetries. 

1.8.1 Conserved Momenta 

If a Lagrangian $L(t, q, v)$ does not depend on some particular
coordinate $q^i$, then

\[ (\partial_1 L)_i = 0, \tag{1.130} \]

and the corresponding $i$th component of the Lagrange equations
is

\[ (D(\partial_2 L \circ \Gamma[q]))_i = 0. \tag{1.131} \]

This is the same as 80

\[ D((\partial_2 L)_i \circ \Gamma[q]) = 0. \tag{1.132} \]

So we see that

\[ \mathcal{P}_i = (\partial_2 L)_i \tag{1.133} \]

is a conserved quantity. The function $\mathcal{P}$ is called the
momentum state function. The value of the momentum state function
is the generalized momentum. We refer to the $i$th component of the
generalized momentum as the momentum conjugate to the $i$th
coordinate. 81 A generalized coordinate component that does not
appear explicitly in the Lagrangian is called a cyclic coordinate.
The generalized momentum component conjugate to any cyclic
coordinate is a constant of the motion. Its value is constant along
realizable paths; it may have different values on different paths.
As we will see, momentum is an important quantity even when it
is not conserved.

79 In the older literature conserved quantities are sometimes called first
integrals.

Given the coordinate path $q$ and the Lagrangian $L$, the
momentum path $p$ is

\[ p = \partial_2 L \circ \Gamma[q] = \mathcal{P} \circ \Gamma[q], \tag{1.134} \]

with components

\[ p_i = \mathcal{P}_i \circ \Gamma[q]. \tag{1.135} \]

The momentum path is well defined for any path $q$. If the path is
realizable and the Lagrangian does not depend on $q^i$ then $p_i$ is a
constant function

\[ Dp_i = 0. \tag{1.136} \]

The constant value of $p_i$ may be different for different trajectories.



80 The derivative of a component is equal to the component of the derivative. 

81 Observe that we indicate a component of the generalized momentum with 
a subscript, and indicate a component of the generalized coordinates with a 
superscript. These conventions are consistent with the ones that are commonly 
used in tensor algebra, which is sometimes helpful in working out complex 
problems.

Examples of conserved momenta

The free particle Lagrangian $L(t, x, v) = \tfrac{1}{2} m v^2$ is
independent of $x$. So the momentum state function,
$\mathcal{P}(t, q, v) = m v$, is conserved along realizable paths. The
momentum path $p$ for the coordinate path $q$ is
$p(t) = \mathcal{P} \circ \Gamma[q](t) = m\, Dq(t)$. For a realizable path
$Dp(t) = 0$. For the free particle the usual linear momentum is
conserved for realizable paths.

For a particle in a central force field (section 1.6), the
Lagrangian

\[ L(t; r, \varphi; \dot r, \dot\varphi) = \tfrac{1}{2} m (\dot r^2 + r^2 \dot\varphi^2) - V(r) \]

depends on $r$ but is independent of $\varphi$. The momentum state
function is

\[ \mathcal{P}(t; r, \varphi; \dot r, \dot\varphi) = [m \dot r,\ m r^2 \dot\varphi]. \]

It has two components. The first component, “the radial
momentum,” is not conserved. The second component, “the angular
momentum,” is conserved along any solution trajectory.

If the central potential problem had been expressed in rectan- 
gular coordinates, then all of the coordinates would have appeared 
in the Lagrangian. In that case there would not be any obvious 
conserved quantities. Nevertheless, the motion of the system does 
not depend on the choice of coordinates; so the angular momen- 
tum is still conserved. 
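
This can be seen numerically by integrating the motion in rectangular coordinates, where no coordinate is cyclic, and monitoring $m(x v_y - y v_x)$, which equals $m r^2 \dot\varphi$. The plain-Python sketch below assumes the potential $V(r) = -1/r$ with unit mass, arbitrary initial conditions, and a fixed-step Runge-Kutta integrator; none of these choices comes from the text.

```python
import math

def central_force_derivative(state):
    """Rectangular-coordinate state derivative for V(r) = -1/r with m = 1."""
    t, x, y, vx, vy = state
    r3 = (x * x + y * y) ** 1.5
    return (1.0, vx, vy, -x / r3, -y / r3)

def rk4_step(deriv, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, d, scale):
        return tuple(si + scale * di for si, di in zip(s, d))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def angular_momentum(state, m=1.0):
    """m (x v_y - y v_x), which equals m r^2 phidot in polar coordinates."""
    t, x, y, vx, vy = state
    return m * (x * vy - y * vx)

state = (0.0, 1.0, 0.0, 0.0, 1.2)      # assumed initial conditions
initial = angular_momentum(state)
max_drift = 0.0
for _ in range(5000):                   # integrate to t = 5 with dt = 0.001
    state = rk4_step(central_force_derivative, state, 0.001)
    max_drift = max(max_drift, abs(angular_momentum(state) - initial))
```

The drift stays at the level of the integration error even though both $x$ and $y$ appear in the Lagrangian.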

We see that there is great advantage in making a judicious 
choice for the coordinate system. If we can choose the coordinates 
so that a symmetry of the system is reflected in the Lagrangian 
by the absence of some coordinate component, then the existence 
of a corresponding conserved quantity will be automatic. 82 

82 In general, conserved quantities in a physical system are associated with
continuous symmetries, whether or not one can find a coordinate system in
which the symmetry is apparent. This powerful notion was formalized and a
theorem linking conservation laws with symmetries was proved by E. Noether
early in the 20th century. See section 1.8.4 on Noether’s theorem.

1.8.2 Energy Conservation

Momenta are conserved by the motion if the Lagrangian does not
depend on the corresponding coordinate. There is another
constant of the motion, the energy, if the Lagrangian $L(t, q, \dot q)$ does
not depend explicitly on the time: $\partial_0 L = 0$.

Consider the time derivative of the Lagrangian along a solution
path $q$:

\[ D(L \circ \Gamma[q]) = \partial_0 L \circ \Gamma[q] + (\partial_1 L \circ \Gamma[q])\, Dq + (\partial_2 L \circ \Gamma[q])\, D^2 q. \tag{1.137} \]

Using Lagrange’s equations to rewrite the second term

\[ D(L \circ \Gamma[q]) = (\partial_0 L) \circ \Gamma[q] + D(\partial_2 L \circ \Gamma[q])\, Dq + (\partial_2 L \circ \Gamma[q])\, D^2 q. \tag{1.138} \]

Isolating $\partial_0 L$ and combining the last two terms on the right
side of (1.138) with the product rule

\[ \begin{aligned}
(\partial_0 L) \circ \Gamma[q] &= D(L \circ \Gamma[q]) - D((\partial_2 L \circ \Gamma[q])\, Dq) \\
&= D(L \circ \Gamma[q]) - D((\partial_2 L \circ \Gamma[q])(\dot{Q} \circ \Gamma[q])) \\
&= D((L - \mathcal{P}\dot{Q}) \circ \Gamma[q]),
\end{aligned} \tag{1.139} \]

where, as before, $\dot{Q}$ selects the velocity from the state. So we see
that if $\partial_0 L = 0$ then

\[ \mathcal{E} = \mathcal{P}\dot{Q} - L \tag{1.140} \]

is conserved along realizable paths. The function $\mathcal{E}$ is called
the energy state function. 83 Let $E = \mathcal{E} \circ \Gamma[q]$ denote
the energy function on the path $q$. The energy function has a constant value
along any realizable trajectory if the Lagrangian has no explicit
time dependence; the energy $E$ may have a different value for
different trajectories. A system that has no explicit time dependence
is called autonomous.

Given a Lagrangian $L$, we may compute the energy:

(define (Lagrangian->energy L)
  (let ((P ((partial 2) L)))
    (- (* P velocity) L)))
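
The same construction can be sketched in plain Python for one degree of freedom. As before this is an illustration rather than the Scmutils implementation: the momentum $\partial_2 L$ is approximated by a central difference with an assumed step `h`, and the harmonic-oscillator Lagrangian and sample state are our choices.

```python
def lagrangian_to_energy(L, h=1e-5):
    """Energy state function E = P v - L, with P = d2 L by central difference."""
    def E(state):
        t, q, v = state
        P = (L(t, q, v + h) - L(t, q, v - h)) / (2 * h)
        return P * v - L(t, q, v)
    return E

def L_harmonic(m, k):
    # One-dimensional harmonic oscillator Lagrangian T - V.
    return lambda t, q, v: 0.5 * m * v * v - 0.5 * k * q * q

m, k = 2.0, 1.0
state = (0.0, 1.5, -0.3)                         # assumed sample state
energy = lagrangian_to_energy(L_harmonic(m, k))(state)
t_plus_v = 0.5 * m * state[2] ** 2 + 0.5 * k * state[1] ** 2
```

For this Lagrangian the computed `energy` coincides with the sum of kinetic and potential energies in `t_plus_v`.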

83 The sign of the energy state function is a matter of convention.

Energy in terms of kinetic and potential energies

In some cases the energy can be written as the sum of kinetic and
potential energies. Suppose the system is composed of particles
with rectangular coordinates $x_\alpha$, the movement of which may be
subject to constraints, and that these rectangular coordinates are
some functions of the generalized coordinates $q$ and possibly time
$t$: $x_\alpha = f_\alpha(t, q)$. We form the Lagrangian as $L = T - V$ and
compute the kinetic energy in terms of $q$ by writing the rectangular
velocities in terms of the generalized velocities:

\[ v_\alpha = \partial_0 f_\alpha(t, q) + \partial_1 f_\alpha(t, q)\, v. \tag{1.141} \]

The kinetic energy is

\[ T(t, q, v) = \tfrac{1}{2} \sum_\alpha m_\alpha v_\alpha^2, \tag{1.142} \]

where $v_\alpha$ is the magnitude of $\vec{v}_\alpha$.

If the $f_\alpha$ functions do not depend explicitly on time
($\partial_0 f_\alpha = 0$), then the rectangular velocities are
homogeneous functions of the generalized velocities of degree 1, and $T$
is a homogeneous function of the generalized velocities of degree 2,
because it is formed by summing the squares of homogeneous functions of
degree 1. If $T$ is a homogeneous function of degree 2 in the generalized
velocities then

\[ \mathcal{P}\dot{Q} = (\partial_2 T)\dot{Q} = 2T, \tag{1.143} \]

where the second equality follows from Euler’s theorem on
homogeneous functions. 84 The energy state function is

\[ \mathcal{E} = \mathcal{P}\dot{Q} - L = \mathcal{P}\dot{Q} - T + V. \tag{1.144} \]

So if $f_\alpha$ is independent of time, the energy function can be
rewritten

\[ \mathcal{E} = 2T - T + V = T + V. \tag{1.145} \]

Notice that if $V$ depends on time the energy is still the sum of
the kinetic energy and potential energy, but the energy is not
conserved.

The energy state function is always a well defined function, 
whether or not it can be written in the form of T + V, and whether 
or not it is conserved along realizable paths. 
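The role of Euler's theorem in this argument can be checked numerically. The sketch below is plain Python under stated assumptions: the polar kinetic energy and the uniformly moving frame are illustrative choices, and the helper names `grad_velocity`, `T_polar`, `T_moving_frame` and `p_dot_v` are hypothetical. It compares the contraction of the momentum with the velocity against $2T$ for a time-independent coordinate transformation and for a time-dependent one.

```python
def grad_velocity(T, q, v, h=1e-6):
    """Central-difference gradient of T(q, v) with respect to the velocities."""
    grad = []
    for i in range(len(v)):
        up, dn = list(v), list(v)
        up[i] += h
        dn[i] -= h
        grad.append((T(q, up) - T(q, dn)) / (2.0 * h))
    return grad

def p_dot_v(T, q, v):
    """Contraction of the momentum (velocity gradient of T) with the velocity."""
    return sum(p * vi for p, vi in zip(grad_velocity(T, q, v), v))

def T_polar(m):
    """Kinetic energy in polar coordinates: x = r cos(theta), y = r sin(theta)."""
    def T(q, v):
        r, _theta = q
        rdot, thetadot = v
        return 0.5 * m * (rdot ** 2 + (r * thetadot) ** 2)
    return T

def T_moving_frame(m, u):
    """Kinetic energy when x = q + u t, so the rectangular velocity is v + u."""
    def T(q, v):
        return 0.5 * m * (v[0] + u) ** 2
    return T

m = 1.5
q, v = [2.0, 0.3], [0.4, -0.7]
static_gap = p_dot_v(T_polar(m), q, v) - 2.0 * T_polar(m)(q, v)
moving_gap = p_dot_v(T_moving_frame(m, 2.0), [0.0], [0.4]) \
    - 2.0 * T_moving_frame(m, 2.0)([0.0], [0.4])
```

In this sketch `static_gap` vanishes to rounding, while `moving_gap` does not, because the kinetic energy in the moving frame contains a term linear in the generalized velocity and is no longer homogeneous of degree 2.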



84 Euler's theorem says that if $f$ is a function of $x = (x_0, x_1, \ldots)$ that is homogeneous of degree $n$ in each of the $x_i$, then $\sum_i x_i\, \partial f / \partial x_i = n f(x)$.

Exercise 1.28:

An analogous result holds when the $f_\alpha$ do depend explicitly on time.

a. Show that in this case the kinetic energy contains terms that are linear in the generalized velocities.

b. Show that, by adding a total time derivative, the Lagrangian can be written in the form $L = A - B$, where $A$ is a homogeneous quadratic form in the generalized velocities, and $B$ is velocity independent.

c. Show, using Euler's theorem, that the energy function is $\mathcal{E} = A + B$.

An example where terms that were linear in the velocity were removed 
from the Lagrangian by adding a total time derivative has already been 
given: the driven pendulum. 

Exercise 1.29: 

A particle of mass m slides off a horizontal cylinder of radius R in a 
uniform gravitational field with acceleration g. If the particle starts 
close to the top with zero initial speed, with what angular velocity does 
the particle leave the cylinder? 

1.8.3 Central Forces in Three Dimensions 

One important physical system is the motion of a particle in a central field in three dimensions, with an arbitrary potential energy $V(r)$ depending only on the radius. We will describe this system in spherical coordinates $r$, $\theta$, and $\varphi$, where $\theta$ is the colatitude and $\varphi$ is the longitude. The kinetic energy has three terms:

$$T(t; r, \theta, \varphi; \dot{r}, \dot{\theta}, \dot{\varphi}) = \tfrac{1}{2} m \left(\dot{r}^2 + r^2 \dot{\theta}^2 + r^2 (\sin\theta)^2 \dot{\varphi}^2\right).$$

As a procedure: 

(define ((T3-spherical m) state)
  (let ((t (time state))
        (q (coordinate state))
        (qdot (velocity state)))
    (let ((r (ref q 0))
          (theta (ref q 1))
          (phi (ref q 2))
          (rdot (ref qdot 0))
          (thetadot (ref qdot 1))
          (phidot (ref qdot 2)))
      (* 1/2 m
         (+ (square rdot)
            (square (* r thetadot))
            (square (* r (sin theta) phidot)))))))

The Lagrangian is then formed by subtracting the potential energy:

(define (L3-central m Vr)
  (define (Vs state)
    (let ((r (ref (coordinate state) 0)))
      (Vr r)))
  (- (T3-spherical m) Vs))

Let’s first look at the generalized forces (the derivatives of the La- 
grangian with respect to the generalized coordinates). We com- 
pute these with a partial derivative with respect to the coordinate 
argument of the Lagrangian: 

(show-expression
 (((partial 1) (L3-central 'm (literal-function 'V)))
  (up 't
      (up 'r 'theta 'phi)
      (up 'rdot 'thetadot 'phidot))))

$$\left[\; m r \dot{\varphi}^2 (\sin\theta)^2 + m r \dot{\theta}^2 - DV(r), \quad m r^2 \dot{\varphi}^2 \sin\theta \cos\theta, \quad 0 \;\right]$$

The $\varphi$ component of the force is zero because $\varphi$ does not appear in the Lagrangian (it is a cyclic variable). The corresponding momentum component is conserved. Compute the momenta:

(show-expression
 (((partial 2) (L3-central 'm (literal-function 'V)))
  (up 't
      (up 'r 'theta 'phi)
      (up 'rdot 'thetadot 'phidot))))

$$\left[\; m \dot{r}, \quad m r^2 \dot{\theta}, \quad m r^2 (\sin\theta)^2 \dot{\varphi} \;\right]$$

The momentum conjugate to $\varphi$ is conserved. This is the $z$ component of the angular momentum $\mathbf{r} \times (m\mathbf{v})$, for vector position $\mathbf{r}$ and linear momentum $m\mathbf{v}$. We can show this by writing the $z$ component of the angular momentum in spherical coordinates:

(define ((ang-mom-z m) state)
  (let ((q (coordinate state))
        (v (velocity state)))
    (ref (cross-product q (* m v)) 2)))

(define (s->r state)
  (let ((q (coordinate state)))
    (let ((r (ref q 0))
          (theta (ref q 1))
          (phi (ref q 2)))
      (let ((x (* r (sin theta) (cos phi)))
            (y (* r (sin theta) (sin phi)))
            (z (* r (cos theta))))
        (up x y z)))))

(show-expression
 ((compose (ang-mom-z 'm) (F->C s->r))
  (up 't
      (up 'r 'theta 'phi)
      (up 'rdot 'thetadot 'phidot))))

$$m r^2 \dot{\varphi} (\sin(\theta))^2$$



The choice of the $z$ axis is arbitrary, so the conservation of any component of the angular momentum implies the conservation of all components. Thus the total angular momentum is conserved. We can choose the $z$ axis so all of the angular momentum is in the $z$ component. The angular momentum must be perpendicular to both the radius vector and to the linear momentum vector. Thus the motion is planar, $\theta = \pi/2$, and $\dot{\theta} = 0$. Planar motion in a central-force field was discussed in section 1.6.
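The identification of the momentum conjugate to the longitude with the $z$ component of the angular momentum can also be confirmed numerically. The following plain-Python sketch is an illustration under stated assumptions: the function names and the sample state are hypothetical, and the velocity transformation is written out by hand rather than produced by `F->C`.

```python
import math

def spherical_to_rect_state(r, theta, phi, rdot, thetadot, phidot):
    """Rectangular position and velocity for a state given in spherical coordinates."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    pos = (r * st * cp, r * st * sp, r * ct)
    vel = (rdot * st * cp + r * ct * cp * thetadot - r * st * sp * phidot,
           rdot * st * sp + r * ct * sp * thetadot + r * st * cp * phidot,
           rdot * ct - r * st * thetadot)
    return pos, vel

def ang_mom_z(m, pos, vel):
    """z component of the cross product of position with linear momentum."""
    x, y, _ = pos
    vx, vy, _ = vel
    return m * (x * vy - y * vx)

def p_phi(m, r, theta, phidot):
    """Momentum conjugate to the longitude for the spherical kinetic energy."""
    return m * r ** 2 * math.sin(theta) ** 2 * phidot

pos, vel = spherical_to_rect_state(1.3, 0.8, 2.1, 0.2, -0.4, 0.6)
lz = ang_mom_z(2.0, pos, vel)
pphi = p_phi(2.0, 1.3, 0.8, 0.6)
```

For any sample state the two numbers `lz` and `pphi` agree to rounding, since the radial and colatitude velocity terms cancel in $x v_y - y v_x$.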

We can also see that the energy state function computed from 
the Lagrangian for a central field is in fact T + V: 

(show-expression
 ((Lagrangian->energy (L3-central 'm (literal-function 'V)))
  (up 't
      (up 'r 'theta 'phi)
      (up 'rdot 'thetadot 'phidot))))

$$\tfrac{1}{2} m \dot{\varphi}^2 r^2 (\sin(\theta))^2 + \tfrac{1}{2} m r^2 \dot{\theta}^2 + \tfrac{1}{2} m \dot{r}^2 + V(r)$$



The energy is conserved because the Lagrangian has no explicit 
time dependence. 

Exercise 1.30: Driven spherical pendulum 

A spherical pendulum is a massive bob, subject to uniform gravity, that 
may swing in three dimensions, but remains at a given distance from 
the pivot. Formulate a Lagrangian for a spherical pendulum, driven 
by vertical motion of the pivot. What symmetry(ies) can you find? 
Find coordinates that express the symmetry. What is conserved? Give 
analytic expression(s) for the conserved quantity(ies).

1.8.4 Noether’s Theorem 

We have seen that if a system has a symmetry and if a coordinate system can be chosen so that the Lagrangian does not depend on the coordinate associated with the symmetry, then there is a conserved quantity associated with the symmetry. However, there are more general symmetries for which there is no coordinate system that fully expresses the symmetry. For example, motion in a central potential is spherically symmetric (the dynamical system is invariant under rotations about any axis), but the expression of the Lagrangian for the system in spherical coordinates exhibits symmetry around only one axis. More generally, a Lagrangian has a symmetry if there is a coordinate transformation that leaves the Lagrangian unchanged. A continuous symmetry is a parametric family of symmetries. Here we show that for any continuous symmetry there is a conserved quantity.

Consider a parametric coordinate transformation $\tilde{F}$ with parameter $s$: 85

$$x = \tilde{F}(s)(t, x'). \tag{1.146}$$

To this parametric coordinate transformation there corresponds a parametric state transformation $\tilde{C}$:

$$(t, x, v) = \tilde{C}(s)(t, x', v'). \tag{1.147}$$

85 Noether's theorem is more general than we state and prove it here. We assume the transformations $\tilde{F}(s)$ have no dependence on the generalized velocities. Properly, we should also consider velocity dependent symmetries.

We require that the transformation $\tilde{F}(0)$ is the identity coordinate transformation, $x' = \tilde{F}(0)(t, x')$; as a consequence $\tilde{C}(0)$ is the identity state transformation, $(t, x', v') = \tilde{C}(0)(t, x', v')$. The Lagrangian $L$ has a continuous symmetry corresponding to $\tilde{F}$ if it is invariant under the transformations

$$\tilde{L}(s) = L \circ \tilde{C}(s) = L \tag{1.148}$$

for any $s$. The Lagrangian $L$ is the same function as the transformed Lagrangian $\tilde{L}(s)$.

That $\tilde{L}(s) = L$ for any $s$ implies $D\tilde{L}(s) = 0$. Explicitly, $\tilde{L}(s)$ is

$$\tilde{L}(s)(t, x', v') = L(t, \tilde{F}(s)(t, x'), D_t(\tilde{F}(s))(t, x', v')), \tag{1.149}$$

where we have rewritten the velocity component of $\tilde{C}(s)$ in terms of the total time derivative. The derivative of $\tilde{L}$ is zero:

$$\begin{aligned}
0 &= D\tilde{L}(s)(t, x', v') \\
&= \partial_1 L(t, x, v)\,(D\tilde{F})(s)(t, x') + \partial_2 L(t, x, v)\, D_t(D\tilde{F}(s))(t, x', v'),
\end{aligned} \tag{1.150}$$

where we have used the fact that 86

$$D_t(D\tilde{F}(s)) = D\tilde{G}(s) \quad \text{with} \quad \tilde{G}(s) = D_t(\tilde{F}(s)). \tag{1.151}$$

On a realizable path $q$ we can use the Lagrange equations to rewrite the first term:

$$\begin{aligned}
0 = {} & (D_t \partial_2 L \circ \Gamma[q])\,((D\tilde{F})(s) \circ \Gamma[q']) \\
& + (\partial_2 L \circ \Gamma[q])\,(D_t(D\tilde{F}(s)) \circ \Gamma[q']).
\end{aligned} \tag{1.152}$$

For $s = 0$ the paths $q$ and $q'$ are the same, so $\Gamma[q] = \Gamma[q']$, and this equation becomes

$$\begin{aligned}
0 &= \left((D_t \partial_2 L)\,((D\tilde{F})(0)) + (\partial_2 L)\,(D_t(D\tilde{F}(0)))\right) \circ \Gamma[q] \\
&= D_t\left((\partial_2 L)\,(D\tilde{F}(0))\right) \circ \Gamma[q].
\end{aligned} \tag{1.153}$$

86 The total time derivative is like a derivative with respect to a real-number argument in that it does not generate structure, so it can commute with derivatives that generate structure. Be careful though, it may not commute with some derivatives for other reasons. For example, $D_t \partial_1(\tilde{F}(s))$ is the same as $\partial_1 D_t(\tilde{F}(s))$, but $D_t \partial_2(\tilde{F}(s))$ is not the same as $\partial_2 D_t(\tilde{F}(s))$. The reason is that $\tilde{F}(s)$ does not depend on the velocity, but $D_t(\tilde{F}(s))$ does.

Thus the state function $\mathcal{I}$,

$$\mathcal{I} = (\partial_2 L)(D\tilde{F}(0)), \tag{1.154}$$

is conserved along solution trajectories. This is Noether's integral. The integral is the product of the momentum and a vector associated with the symmetry.

Illustration: motion in a central potential 

For example, consider the central potential Lagrangian in rectangular coordinates:

$$L(t; x, y, z; v_x, v_y, v_z) = \tfrac{1}{2} m \left(v_x^2 + v_y^2 + v_z^2\right) - U\!\left(\sqrt{x^2 + y^2 + z^2}\right), \tag{1.155}$$

and a parametric rotation $R_z(s)$ about the $z$ axis

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = R_z(s) \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} x' \cos s - y' \sin s \\ x' \sin s + y' \cos s \\ z' \end{pmatrix}. \tag{1.156}$$

The rotation is an orthogonal transformation so

$$x^2 + y^2 + z^2 = (x')^2 + (y')^2 + (z')^2. \tag{1.157}$$

Differentiating along a path, we get

$$(v_x, v_y, v_z) = R_z(s)(v'_x, v'_y, v'_z), \tag{1.158}$$

so the velocities also transform by an orthogonal transformation, and $v_x^2 + v_y^2 + v_z^2 = (v'_x)^2 + (v'_y)^2 + (v'_z)^2$. Thus

$$L'(t; x', y', z'; v'_x, v'_y, v'_z) = \tfrac{1}{2} m \left((v'_x)^2 + (v'_y)^2 + (v'_z)^2\right) - U\!\left(\sqrt{(x')^2 + (y')^2 + (z')^2}\right), \tag{1.159}$$

and we see that $L'$ is precisely the same function as $L$.

The momenta are

$$\partial_2 L(t; x, y, z; v_x, v_y, v_z) = [m v_x, m v_y, m v_z] \tag{1.160}$$

and, differentiating (1.156) with respect to $s$ at $s = 0$,

$$D\tilde{F}(0)(t; x, y, z) = DR_z(0)(x, y, z) = (-y, x, 0). \tag{1.161}$$

So the Noether integral is

$$\begin{aligned}
\mathcal{I}(t; x, y, z; v_x, v_y, v_z) &= ((\partial_2 L)(D\tilde{F}(0)))(t; x, y, z; v_x, v_y, v_z) \\
&= m(x v_y - y v_x),
\end{aligned} \tag{1.162}$$

which we recognize as the $z$ component of the angular momentum $\mathbf{x} \times (m\mathbf{v})$. Since the Lagrangian is preserved by any continuous rotational symmetry, all components of the vector angular momentum are conserved for the central potential problem.

The procedure calls ((Rx angle-x) q), ((Ry angle-y) q), and ((Rz angle-z) q) rotate the rectangular tuple q about the indicated axis by the indicated angle. 87 We use these to make a parametric coordinate transformation F-tilde:

(define (F-tilde angle-x angle-y angle-z)
  (compose (Rx angle-x) (Ry angle-y) (Rz angle-z) coordinate))

A Lagrangian for motion in a central potential is:

(define ((L-central-rectangular m U) state)
  (let ((q (coordinate state))
        (v (velocity state)))
    (- (* 1/2 m (square v)) (U (sqrt (square q))))))

The Noether integral is then

(define Noether-integral
  (let ((L (L-central-rectangular
            'm (literal-function 'U))))
    (* ((partial 2) L) ((D F-tilde) 0 0 0))))

(print-expression
 (Noether-integral
  (up 't
      (up 'x 'y 'z)
      (up 'vx 'vy 'vz))))

(down (+ (* -1 m vy z) (* m vz y))
      (+ (* m vx z) (* -1 m vz x))
      (+ (* -1 m vx y) (* m vy x)))

We get all three components of the angular momentum.

87 The definition of the procedure Rx is

(define ((Rx angle) q)
  (let ((ca (cos angle)) (sa (sin angle)))
    (let ((x (ref q 0)) (y (ref q 1)) (z (ref q 2)))
      (up x
          (- (* ca y) (* sa z))
          (+ (* sa y) (* ca z))))))

The definitions of Ry and Rz are similar.
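That the Noether integral really is constant along realizable paths can be checked by direct integration. The sketch below is plain Python and rests on several assumptions that are not in the text: the potential is taken to be $U(r) = -1/r$ with unit mass, the trajectory is advanced by a classical fourth-order Runge-Kutta step, and the derivative of the rotation at $s = 0$ is estimated by a central difference. All function names are illustrative.

```python
import math

def rotate_z(s, q):
    """Rotate the rectangular tuple q about the z axis by angle s."""
    x, y, z = q
    return (x * math.cos(s) - y * math.sin(s),
            x * math.sin(s) + y * math.cos(s),
            z)

def noether_integral(m, rotation, q, v, h=1e-6):
    """Momentum contracted with the s-derivative of the rotation at s = 0."""
    plus, minus = rotation(h, q), rotation(-h, q)
    dF = [(a - b) / (2.0 * h) for a, b in zip(plus, minus)]
    return sum(m * vi * di for vi, di in zip(v, dF))

def acceleration(q):
    """Acceleration for the assumed potential U(r) = -1/r with unit mass."""
    r3 = sum(c * c for c in q) ** 1.5
    return [-c / r3 for c in q]

def rk4_step(q, v, dt):
    """One classical Runge-Kutta step for Dq = v, Dv = acceleration(q)."""
    def shifted(base, slope, scale):
        return [b + scale * s for b, s in zip(base, slope)]
    k1q, k1v = v, acceleration(q)
    k2q, k2v = shifted(v, k1v, dt / 2), acceleration(shifted(q, k1q, dt / 2))
    k3q, k3v = shifted(v, k2v, dt / 2), acceleration(shifted(q, k2q, dt / 2))
    k4q, k4v = shifted(v, k3v, dt), acceleration(shifted(q, k3q, dt))
    q_next = [qi + dt / 6 * (a + 2 * b + 2 * c + d)
              for qi, a, b, c, d in zip(q, k1q, k2q, k3q, k4q)]
    v_next = [vi + dt / 6 * (a + 2 * b + 2 * c + d)
              for vi, a, b, c, d in zip(v, k1v, k2v, k3v, k4v)]
    return q_next, v_next

q, v = [1.0, 0.0, 0.2], [0.0, 0.9, 0.1]
values = []
for _ in range(2000):
    values.append(noether_integral(1.0, rotate_z, q, v))
    q, v = rk4_step(q, v, 1e-3)

# Difference between the largest and smallest sampled value of the integral.
spread = max(values) - min(values)
```

The positions and velocities change substantially over the integration, yet `spread` stays at the level of the numerical error, as expected for a quantity associated with a continuous symmetry of the Lagrangian.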



1.9 Abstraction of Path Functions 

An essential step in the derivation of the local-tuple transforma- 
tion function C from the coordinate transformation F was the 
deduction of the relationship between the velocities in the two 
coordinate systems. We did this by inserting coordinate paths 
into the coordinate transformation function F, differentiating, and 
then generalizing the results on the path to arbitrary velocities at 
a moment. The last step is an example of a more general problem 
of abstracting a local-tuple function from a path function. Given a function $f$ of a local tuple, a corresponding path-dependent function $\bar{f}[q]$ is $\bar{f}[q] = f \circ \Gamma[q]$. Given $\bar{f}$, how can we reconstitute $f$? The local-tuple function $f$ depends on only a finite number of components of the local tuple, and $\bar{f}$ only depends on the corresponding local components of the path. So $\bar{f}$ has the same value for all paths that have that number of components of the local tuple in common. Given $\bar{f}$ we can reconstitute $f$ by taking the argument of $f$, which is a finite initial segment of a local tuple, constructing a path that has this local description, and finding the value of $\bar{f}$ for this path.

Two paths that have the same local description up to the $n$th derivative are said to osculate with order $n$ contact. For example, a path and the truncated power series representation of the path up to order $n$ have order $n$ contact; if fewer than $n$ derivatives are needed by a local-tuple function, the path and the truncated power series representation are equivalent. Let $O$ be a function that generates an osculating path with the given local tuple components. So $O(t, q, v, \ldots)(t) = q$, $D(O(t, q, v, \ldots))(t) = v$, and in general

$$(t, q, v, \ldots) = \Gamma[O(t, q, v, \ldots)](t). \tag{1.163}$$

The number of components of the local tuple that are required is finite, but unspecified. One way of constructing $O$ is through the truncated power series

$$O(t, q, v, a, \ldots)(t') = q + v(t' - t) + \tfrac{1}{2} a (t' - t)^2 + \cdots, \tag{1.164}$$

where the number of terms is the same as the number of components of the local tuple that are specified.

Given the path function $\bar{f}$ we reconstitute the $f$ function as follows. We take the argument of $f$ and construct an osculating path with this local description. Then the value of $f$ is the value of $\bar{f}$ for this osculating path:

$$f(t, q, v, \ldots) = (f \circ \Gamma[O(t, q, v, \ldots)])(t) = \bar{f}[O(t, q, v, \ldots)](t). \tag{1.165}$$

Let $\bar{\Gamma}$ be the function that takes a path function and returns the corresponding local-tuple function:

$$f = \bar{\Gamma}(\bar{f}). \tag{1.166}$$

From equation (1.165) we see that

$$\bar{\Gamma}(\bar{f})(t, q, v, \ldots) = \bar{f}[O(t, q, v, \ldots)](t). \tag{1.167}$$

The procedure Gamma-bar implements the function $\bar{\Gamma}$ that reconstitutes a path-dependent function into a local-tuple function:

(define ((Gamma-bar f-bar) local)
  ((f-bar (osculating-path local)) (time local)))

The procedure osculating-path takes a number of local compo- 
nents and returns a path with these components; it is implemented 
as a power series. 
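The construction can be imitated numerically. The following plain-Python sketch is only an illustration under stated assumptions: local tuples are flat Python tuples `(t, q, v, a, ...)` for one degree of freedom, derivatives along the path are estimated by central differences, and the names `osculating_path`, `D`, `gamma_bar` and `f_bar` are stand-ins rather than the scmutils procedures.

```python
def osculating_path(local):
    """Truncated power series path with the given local description.

    local = (t, q, v, a, ...); the returned path maps t' to
    q + v (t' - t) + a (t' - t)^2 / 2 + ...
    """
    t, *derivs = local
    def path(tp):
        total, term = 0.0, 1.0
        for n, d in enumerate(derivs):
            if n > 0:
                term *= (tp - t) / n      # builds (t' - t)^n / n!
            total += d * term
        return total
    return path

def D(f, h=1e-4):
    """Central-difference derivative of a function of time."""
    return lambda t: (f(t + h) - f(t - h)) / (2.0 * h)

def gamma_bar(f_bar):
    """Turn a path-dependent function into a function of a local tuple."""
    def f(local):
        return f_bar(osculating_path(local))(local[0])
    return f

def f_bar(q):
    """An example path-dependent function: the path times its velocity."""
    return lambda t: q(t) * D(q)(t)

f = gamma_bar(f_bar)
value = f((0.5, 2.0, 3.0))    # local tuple with t = 0.5, q = 2, v = 3
```

Here `value` reproduces the product of the coordinate and velocity components of the local tuple, even though `f_bar` was written entirely in terms of paths.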

We can use Gamma-bar to construct the procedure F->C that takes a coordinate transformation F and generates the procedure that transforms local tuples. The procedure F->C constructs a path-dependent procedure f-bar that takes a coordinate path in the primed system and returns the local tuple of the corresponding path in the unprimed coordinate system. It then uses Gamma-bar to abstract f-bar to arbitrary local tuples in the primed coordinate system.

(define (F->C F)
  (define (f-bar q-prime)
    (define q
      (compose F (Gamma q-prime)))
    (Gamma q))
  (Gamma-bar f-bar))

(show-expression
 ((F->C p->r)
  (->local 't (up 'r 'theta) (up 'rdot 'thetadot))))

Notice that in this definition of F->C we do not explicitly calculate any derivatives. The calculation that led up to the state transformation (1.74) is not needed.

We can also use $\bar{\Gamma}$ to make an elegant formula for computing the total time derivative $D_t F$ of the function $F$:

$$D_t F = \bar{\Gamma}(\bar{G}), \quad \text{with} \quad \bar{G}[q] = D(F \circ \Gamma[q]). \tag{1.168}$$

The implementation of the total time derivative as a program 
follows this definition. Given a procedure F implementing a local- 
tuple function and a path q we can construct a new procedure 
(compose F (Gamma q)). The procedure G-bar implements the 
derivative of this function of time. We then abstract this off the 
path with Gamma-bar to give the total time derivative. 

(define (Dt F)
  (define (G-bar q)
    (D (compose F (Gamma q))))
  (Gamma-bar G-bar))

Exercise 1.31: Velocity transformation

Use the procedure Gamma-bar to construct a procedure that transforms velocities given a coordinate transformation. Apply this procedure to the procedure p->r to deduce (again) equation (1.65).

Exercise 1.32: Path functions and state functions

The local-tuple function $f$ is the same as the local-tuple function $\bar{\Gamma}(\bar{f})$ where $\bar{f}[q] = f \circ \Gamma[q]$. On the other hand, the path function $\bar{f}[q]$ and the path function $\bar{\Gamma}(\bar{f}) \circ \Gamma[q]$ are not necessarily the same. Explain. Give examples where they are the same and where they are not the same. Write programs to illustrate the behavior.

Lagrange equations at a moment 

Given a Lagrangian, the Lagrange equations test paths for whether 
they are realizable paths of the system. The Lagrange equations 
relate the path and its derivatives. The fact that the Lagrange 
equations must be satisfied at each moment suggests that we can 
abstract the Lagrange equations off the path and write them as 
relations among the local-tuple components of realizable paths. 

Let $\bar{E}[L]$ be the path-dependent function that produces the residuals of the Lagrange equations (1.18) for the Lagrangian $L$:

$$\bar{E}[L][q] = D(\partial_2 L \circ \Gamma[q]) - \partial_1 L \circ \Gamma[q]. \tag{1.169}$$

Realizable paths $q$ satisfy the Lagrange equations

$$\bar{E}[L][q] = 0. \tag{1.170}$$

The path-dependent Lagrange equations can be converted to local Lagrange equations using $\bar{\Gamma}$:

$$E[L] = \bar{\Gamma}(\bar{E}[L]). \tag{1.171}$$

The operator $E$ is called the Euler-Lagrange operator. In terms of this operator the Lagrange equations are

$$E[L] \circ \Gamma[q] = 0. \tag{1.172}$$

Applying the definition (1.167) of $\bar{\Gamma}$,

$$\begin{aligned}
E[L](t, q, v, \ldots) &= \bar{\Gamma}(\bar{E}[L])(t, q, v, \ldots) \\
&= D(\partial_2 L \circ \Gamma[O(t, q, v, \ldots)])(t) - \partial_1 L \circ \Gamma[O(t, q, v, \ldots)](t) \\
&= (D_t(\partial_2 L))(t, q, v, \ldots) - \partial_1 L(t, q, v, \ldots) \\
&= (D_t \partial_2 L - \partial_1 L)(t, q, v, \ldots).
\end{aligned} \tag{1.173}$$

So the Euler-Lagrange operator is explicitly

$$E[L] = D_t \partial_2 L - \partial_1 L. \tag{1.174}$$

The procedure Euler-Lagrange-operator implements $E$:

(define (Euler-Lagrange-operator L)
  (- (Dt ((partial 2) L)) ((partial 1) L)))

For example, applied to the Lagrangian for the harmonic oscil- 
lator, 

(print-expression
 ((Euler-Lagrange-operator
   (L-harmonic 'm 'k))
  (->local 't 'x 'v 'a)))

(+ (* a m) (* k x))

Notice that the components of the local tuple are individually specified. Using equation (1.172), the Lagrange equations for the harmonic oscillator are: 88

(print-expression
 ((compose
   (Euler-Lagrange-operator (L-harmonic 'm 'k))
   (Gamma (literal-function 'x) 4))
  't))

(+ (* k (x t)) (* m (((expt D 2) x) t)))
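The same chain of definitions can be traced numerically. The plain-Python sketch below mirrors the structure of `Dt` and `Euler-Lagrange-operator` for one degree of freedom, under these assumptions: local tuples are flat tuples `(t, q, v, a)`, every derivative is a central difference, and all names (`osculating_path`, `gamma`, `partial`, `Dt`, `euler_lagrange_operator`, `L_harmonic`) are illustrative stand-ins rather than the scmutils procedures.

```python
def osculating_path(local):
    """Truncated power series path through the local tuple (t, q, v, a, ...)."""
    t, *derivs = local
    def path(tp):
        total, term = 0.0, 1.0
        for n, d in enumerate(derivs):
            if n > 0:
                term *= (tp - t) / n
            total += d * term
        return total
    return path

def D(f, h=1e-4):
    """Central-difference derivative of a function of time."""
    return lambda t: (f(t + h) - f(t - h)) / (2.0 * h)

def gamma(q):
    """Local tuple (t, q(t), Dq(t)) of a path, up to the velocity."""
    return lambda t: (t, q(t), D(q)(t))

def partial(i, L, h=1e-5):
    """Central-difference derivative of L(t, q, v) in slot i."""
    def dL(t, q, v):
        up, dn = [t, q, v], [t, q, v]
        up[i] += h
        dn[i] -= h
        return (L(*up) - L(*dn)) / (2.0 * h)
    return dL

def gamma_bar(f_bar):
    """Abstract a path-dependent function to a local-tuple function."""
    return lambda local: f_bar(osculating_path(local))(local[0])

def Dt(F):
    """Total time derivative: differentiate F along an osculating path."""
    def G_bar(q):
        return D(lambda t: F(*gamma(q)(t)))
    return gamma_bar(G_bar)

def euler_lagrange_operator(L):
    """Residual D_t(partial_2 L) - partial_1 L as a function of (t, q, v, a)."""
    momentum, force = partial(2, L), partial(1, L)
    total = Dt(momentum)
    return lambda local: total(local) - force(*local[:3])

def L_harmonic(m, k):
    return lambda t, x, v: 0.5 * m * v * v - 0.5 * k * x * x

residual = euler_lagrange_operator(L_harmonic(2.0, 3.0))((0.0, 1.0, 0.5, -0.25))
```

For this local tuple the residual agrees, within the finite-difference error, with $m a + k x$ evaluated at the supplied acceleration and position, matching the symbolic result printed above.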

Exercise 1.33: Properties of E 

Let F and G be two Lagrangian-like functions of a local tuple, C be a 
local-tuple transformation function, and c a constant. Demonstrate the 
following properties: 

a. $E[F + G] = E[F] + E[G]$

b. $E[cF] = cE[F]$

c. $E[FG] = E[F]\,G + F\,E[G] + (D_t F)\,\partial_2 G + \partial_2 F\,(D_t G)$

d. $E[F \circ C] = D_t(DF \circ C)\,\partial_2 C + DF \circ C\; E[C]$



88 Notice that Gamma has one more argument than it usually has. This argument gives the length of the initial segment of the local tuple needed. The default length is 3, giving components of the local tuple up to and including the velocities.

1.10 Constrained Motion

An advantage of the Lagrangian approach is that the coordinates can often be chosen to exactly describe the freedom of the system, automatically incorporating any constraints. We may also use coordinates that have more freedom than the system actually has and consider explicit constraints among the coordinates. For example, the planar pendulum has a one-dimensional configuration space. We have formulated this problem using the angle from the vertical as the configuration coordinate. Alternatively, we may choose to represent the pendulum as a body moving in the plane, constrained to be on the circle of the correct radius around the pivot. We would like to have valid descriptions for both choices and show they are equivalent. In this section we develop tools to handle problems with explicit constraints. The constraints considered here are more general than those considered in the demonstration that the Lagrangian for systems with rigid constraints can be written as the difference of kinetic and potential energies (see section 1.6.2).

Suppose the configuration of a system with $n$ degrees of freedom is specified by $n + 1$ coordinates and that configuration paths $q$ are constrained to satisfy some relation of the form

$$\varphi(t, q(t), Dq(t)) = 0. \tag{1.175}$$

How do we formulate the equations of motion? One approach 
would be to use the constraint equation to eliminate one of the 
coordinates in favor of the rest, and then the evolution of the 
reduced set of generalized coordinates would be described by the 
usual Lagrange equations. The equations governing the evolution 
of coordinates that are not fully independent should be equivalent. 

We can address the problem of formulating equations of motion for systems with redundant coordinates by returning to the action principle. Realizable paths are distinguished from other paths by having stationary action. Stationary refers to the fact that the action does not change with certain small variations of the path. What variations should be considered? We have seen that velocity-independent rigid constraints can be used to eliminate redundant coordinates. In the irredundant coordinates we distinguished realizable paths using variations that by construction satisfy the constraints. Thus in the case where constraints can be used to eliminate redundant coordinates we can restrict the variations in the path to those that are consistent with the constraints.

So how does the restriction of the possible variations affect the argument that led to Lagrange's equations (refer to section 1.5)? Actually most of the calculation is unaffected. The condition that the action is stationary still reduces to the condition (1.34):

$$0 = \int_{t_1}^{t_2} \left\{ (\partial_1 L \circ \Gamma[q]) - D(\partial_2 L \circ \Gamma[q]) \right\} \eta. \tag{1.176}$$

At this point we argued that because the variations $\eta$ are arbitrary (except for conditions at the endpoints), the only way for the integral to be zero is for the integrand to be zero. Furthermore, the freedom in our choice of $\eta$ allowed us to deduce that the factor multiplying $\eta$ in the integrand must be identically zero, thereby deriving Lagrange's equations.

Now the choice of $\eta$ is not completely free. We may still deduce from the arbitrariness of $\eta$ that the integrand must be zero, 89 but we may no longer deduce that the factor multiplying $\eta$ is zero (only that the projection of this factor onto acceptable variations is zero). So we have

$$\left\{ (\partial_1 L \circ \Gamma[q]) - D(\partial_2 L \circ \Gamma[q]) \right\} \eta = 0, \tag{1.177}$$

with $\eta$ subject to the constraints.

89 Given any acceptable variation we may make another acceptable variation by multiplying the given one by a bump function that emphasizes any particular time interval.

A path $q$ satisfies the constraint if $\bar{\varphi}[q] = \varphi \circ \Gamma[q] = 0$. The constraint must be satisfied even for the varied path, so we only allow variations $\eta$ for which the variation of the constraint is zero:

$$\delta_\eta \bar{\varphi}[q] = 0. \tag{1.178}$$

We can say that the variation must be "tangent" to the constraint surface. Expanding this with the chain rule, a variation $\eta$ is tangent to the constraint surface $\varphi$ if

$$(\partial_1 \varphi \circ \Gamma[q])\, \eta + (\partial_2 \varphi \circ \Gamma[q])\, D\eta = 0. \tag{1.179}$$

Note that these are functions of time; the variation at a given time is tangent to the constraint at that time.

1.10.1 Coordinate Constraints 

Consider constraints that do not depend on velocities:

$$\partial_2 \varphi = 0.$$

In this case the variation is tangent to the constraint surface if

$$(\partial_1 \varphi \circ \Gamma[q])\, \eta = 0. \tag{1.180}$$

Together, equations (1.177) and (1.180) should determine the motion, but how do we eliminate $\eta$? The residual of the Lagrange equations is orthogonal 90 to any $\eta$ that is orthogonal to the normal to the constraint surface. A vector that is orthogonal to all vectors orthogonal to a given vector is parallel to the given vector. Thus, the residual of Lagrange's equations is parallel to the normal to the constraint surface; the two must be proportional:

$$D(\partial_2 L \circ \Gamma[q]) - \partial_1 L \circ \Gamma[q] = \lambda\, (\partial_1 \varphi) \circ \Gamma[q]. \tag{1.181}$$

That the two vectors are parallel everywhere along the path does not guarantee that the proportionality factor is the same at each moment along the path, so the proportionality factor $\lambda$ is some function of time, which may depend on the path under consideration. These equations, with the constraint equation $\varphi \circ \Gamma[q] = 0$, are the governing equations. These equations are sufficient to determine the path $q$ and to eliminate the unknown function $\lambda$.

90 We take two tuple-valued functions of time to be orthogonal if at each instant the dot product of the tuples is zero. Similarly, tuple-valued functions are considered parallel if at each moment one of the tuples is a scalar multiple of the other. The scalar multiplier is in general a function of time.

Now watch this

Suppose we form an augmented Lagrangian treating $\lambda$ as one of the coordinates:

$$L'(t; q, \lambda; \dot{q}, \dot{\lambda}) = L(t, q, \dot{q}) + \lambda\, \varphi(t, q, \dot{q}). \tag{1.182}$$

The Lagrange equations associated with the coordinates $q$ are just the modified Lagrange equations (1.181), and the Lagrange equation associated with $\lambda$ is just the constraint equation. (Note that $\dot{\lambda}$ does not appear in the augmented Lagrangian.) So the Lagrange equations for this augmented Lagrangian fully encapsulate the modification to the Lagrange equations that is imposed by the addition of an explicit coordinate constraint, at the expense of introducing extra degrees of freedom. Notice that this Lagrangian is of the same form as Lagrangian (1.89) that we used in the derivation of $L = T - V$ for rigid systems (section 1.6.2).

Alternatively 

How do we know that we have enough information to eliminate the unknown function $\lambda$ from equations (1.181) or that the extra degree of freedom introduced in Lagrangian (1.182) is purely formal?

If $\lambda$ could be written as a function of the solution state path, then it would be clear that it is determined by the state and can thus be eliminated. Okay, suppose $\lambda$ can be written as a composition of a state-dependent function with the path: $\lambda = \Lambda \circ \Gamma[q]$. Consider the Lagrangian

$$L'' = L + \Lambda \varphi. \tag{1.183}$$

This new Lagrangian has no extra degrees of freedom. The Lagrange equations for $L''$ are the Lagrange equations for $L$ with additional terms arising from the product $\Lambda \varphi$. Applying the Euler-Lagrange operator $E$ (see section 1.9) to this Lagrangian gives 91

$$\begin{aligned}
E[L''] &= E[L] + E[\Lambda \varphi] \\
&= E[L] + \Lambda\, E[\varphi] + E[\Lambda]\, \varphi + D_t \Lambda\, \partial_2 \varphi + \partial_2 \Lambda\, D_t \varphi.
\end{aligned} \tag{1.184}$$

Composition of $E[L'']$ with $\Gamma[q]$ gives the Lagrange equations for the path $q$. Using the fact that the constraint is satisfied on the path, $\varphi \circ \Gamma[q] = 0$, and consequently $D_t \varphi \circ \Gamma[q] = 0$, we have

$$E[L''] \circ \Gamma[q] = E[L] \circ \Gamma[q] + \lambda\, (E[\varphi] \circ \Gamma[q]) + D\lambda\, (\partial_2 \varphi \circ \Gamma[q]), \tag{1.185}$$

where we have used $\lambda = \Lambda \circ \Gamma[q]$. If we now use the fact that we are only dealing with coordinate constraints, $\partial_2 \varphi = 0$, then

$$E[L''] \circ \Gamma[q] = E[L] \circ \Gamma[q] + \lambda\, (E[\varphi] \circ \Gamma[q]). \tag{1.186}$$

91 Recall that the Euler-Lagrange operator $E$ has the property $E[FG] = F\,E[G] + E[F]\,G + D_t F\, \partial_2 G + \partial_2 F\, D_t G$.

Figure 1.8 We can formulate the behavior of a pendulum as motion in the plane, constrained to a circle about the pivot.

The Lagrange equations are the same as those derived from the augmented Lagrangian $L'$. The difference is that now we see that $\lambda = \Lambda \circ \Gamma[q]$ is determined by the unaugmented state. This is the same as saying that $\lambda$ can be eliminated.

Considering only the formal validity of the Lagrange equations for the augmented Lagrangian, we could not deduce that $\lambda$ could be written as the composition of a state-dependent function $\Lambda$ with $\Gamma[q]$. The explicit Lagrange equations derived from the augmented Lagrangian depend on the accelerations $D^2 q$ as well as $\lambda$, so we may not deduce separately that either is the composition of a state-dependent function and $\Gamma[q]$. However, now we see that $\lambda$ is such a composition. This allows us to deduce that $D^2 q$ is also a state-dependent function composed with the path. The evolution of the system is determined from the dynamical state.

The pendulum using constraints 

The pendulum can be formulated as the motion of a massive particle in a vertical plane subject to the constraint that the distance to the pivot is constant (see figure 1.8).

In this formulation, the kinetic and potential energies in the Lagrangian are those of an unconstrained particle in a uniform gravitational acceleration. A Lagrangian for the unconstrained particle is

$$L(t; x, y; v_x, v_y) = \tfrac{1}{2} m (v_x^2 + v_y^2) - m g y. \tag{1.187}$$

The constraint that the pendulum moves in a circle of radius $l$ about the pivot is 92

$$x^2 + y^2 - l^2 = 0. \tag{1.188}$$

The augmented Lagrangian is

$$L'(t; x, y, \lambda; v_x, v_y, \dot{\lambda}) = \tfrac{1}{2} m (v_x^2 + v_y^2) - m g y + \lambda (x^2 + y^2 - l^2). \tag{1.189}$$

The Lagrange equations for the augmented Lagrangian are

$$m D^2 x - 2 \lambda x = 0 \tag{1.190}$$

$$m D^2 y + m g - 2 \lambda y = 0 \tag{1.191}$$

$$x^2 + y^2 - l^2 = 0. \tag{1.192}$$

These equations are sufficient to solve for the motion of the pendulum.

92 This constraint has the same form as the constraints used in the demonstration that $L = T - V$ can be used for rigid systems. Here it is a particular example of a more general set of constraints.

It should not be surprising that these equations simplify if we switch to "polar" coordinates

$$x = r \sin\theta, \qquad y = -r \cos\theta. \tag{1.193}$$

Substituting this into the constraint equation we determine that $r = l$, a constant. Forming the derivatives and substituting into the other two equations we find

$$m l \left(\cos\theta\, D^2\theta - \sin\theta\, (D\theta)^2\right) - 2 \lambda l \sin\theta = 0 \tag{1.194}$$

$$m l \left(\sin\theta\, D^2\theta + \cos\theta\, (D\theta)^2\right) + m g + 2 \lambda l \cos\theta = 0. \tag{1.195}$$

Multiplying the first by $\cos\theta$ and the second by $\sin\theta$ and adding, we find

$$m l D^2\theta + m g \sin\theta = 0, \tag{1.196}$$

which we recognize as the correct equation for the pendulum. This is the same as the Lagrange equation for the pendulum using the unconstrained generalized coordinate $\theta$. For completeness, we can find $\lambda$ in terms of the other variables:

$$\lambda = \frac{m D^2 x}{2 x} = -\frac{1}{2l} \left(m g \cos\theta + m l (D\theta)^2\right). \tag{1.197}$$

This confirms that $\lambda$ is really the composition of a function of the state with the state path. Notice that $2 l \lambda$ is a force: it is the sum of the outward component of the gravitational force and the centrifugal force. Using this interpretation in the two coordinate equations of motion we see that the terms involving $\lambda$ are the forces that must be applied to the unconstrained particle to make it move on the circle required by the constraints. Equivalently, we may think of $2 l \lambda$ as the tension in the pendulum rod that holds the mass. 93
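The algebra of this example is easy to verify at an arbitrary state. The plain-Python sketch below is an illustration with hypothetical names and parameter values: it takes the angular acceleration from the reduced pendulum equation $m l D^2\theta + m g \sin\theta = 0$, computes the multiplier as $\lambda = -(m g \cos\theta + m l (D\theta)^2)/(2l)$, and evaluates the residuals of the two rectangular equations $m D^2 x - 2\lambda x = 0$ and $m D^2 y + m g - 2\lambda y = 0$.

```python
import math

def pendulum_check(m, l, g, theta, thetadot):
    """Residuals of the rectangular equations at the state (theta, thetadot)."""
    # Angular acceleration from the reduced pendulum equation.
    thetaddot = -(g / l) * math.sin(theta)
    # Multiplier expressed as a function of the state.
    lam = -(m * g * math.cos(theta) + m * l * thetadot ** 2) / (2.0 * l)
    # Rectangular coordinates and accelerations for x = l sin(theta), y = -l cos(theta).
    x, y = l * math.sin(theta), -l * math.cos(theta)
    xddot = l * (math.cos(theta) * thetaddot - math.sin(theta) * thetadot ** 2)
    yddot = l * (math.sin(theta) * thetaddot + math.cos(theta) * thetadot ** 2)
    res_x = m * xddot - 2.0 * lam * x
    res_y = m * yddot + m * g - 2.0 * lam * y
    return res_x, res_y, lam

res_x, res_y, lam = pendulum_check(m=1.5, l=2.0, g=9.8, theta=0.7, thetadot=1.1)
```

Both residuals vanish to rounding for any choice of state, which illustrates that the multiplier is determined by the state and that the constrained rectangular formulation agrees with the single-coordinate pendulum equation.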

Building systems from parts 

The method of using augmented Lagrangians to enforce con- 
straints on dynamical systems provides us with a way of building 
the analysis of a compound system by combining the results of 
the analysis of the parts of the system and the coupling between 
them. 

Consider the compound spring-mass system shown at the top of
figure 1.9. We could analyze this as a monolithic system with two
configuration coordinates $x_1$ and $x_2$, representing the extensions
of the springs from their equilibrium lengths $X_1$ and $X_2$.

An alternative procedure is to break the system into several
parts. In our spring-mass system we can choose two parts: one is
a spring and mass attached to the wall, and the other is a spring
and mass with its attachment point at an additional configuration
coordinate $\xi$. We can formulate a Lagrangian for each part
separately. We can then choose a Lagrangian for the composite system
as the sum of the two component Lagrangians with a constraint
$\xi = X_1 + x_1$ to accomplish the coupling.



93 Indeed, if we had scaled the constraint equations as we did in the discussion
of Newtonian constraint forces we could have identified $\lambda$ with the magnitude
of the constraint force $F$. However, though $\lambda$ will in general be related to
the constraint forces it will not be one of them. We chose to leave the scaling
as it naturally appeared rather than make things turn out artificially pretty.
Figure 1.9 A compound spring-mass system is decomposed into two
subsystems. We have two springs and masses that may only move
horizontally. The equilibrium positions of the springs are $X_1$ and $X_2$. The
systems are coupled by the position-coordinate constraint $\xi = X_1 + x_1$.



Let’s see how this works. The Lagrangian for the subsystem 
attached to the wall is 

$L_1(t, x_1, \dot x_1) = \tfrac{1}{2} m_1 \dot x_1^2 - \tfrac{1}{2} k_1 x_1^2$ (1.198)

and the Lagrangian for the subsystem that attaches to it is 

$L_2(t; \xi, x_2; \dot\xi, \dot x_2) = \tfrac{1}{2} m_2 (\dot\xi + \dot x_2)^2 - \tfrac{1}{2} k_2 x_2^2.$ (1.199)

We construct a Lagrangian for the system composed from these 
parts as a sum of the Lagrangians for each of the separate parts, 
with a coupling term to enforce the constraint:

$L(t; x_1, x_2, \xi, \lambda; \dot x_1, \dot x_2, \dot\xi, \dot\lambda) = L_1(t, x_1, \dot x_1) + L_2(t; \xi, x_2; \dot\xi, \dot x_2) + \lambda (\xi - (X_1 + x_1)).$ (1.200)



Thus we can write Lagrange’s equations for the four configuration 
coordinates, in order, as follows: 



$m_1 D^2 x_1 = -k_1 x_1 - \lambda$ (1.201)

$m_2 (D^2 \xi + D^2 x_2) = -k_2 x_2$ (1.202)

$m_2 (D^2 \xi + D^2 x_2) = \lambda$ (1.203)

$0 = \xi - (X_1 + x_1)$ (1.204)



Notice that in this system $\lambda$ is the force of constraint, holding the
system together. We can now eliminate the “glue” coordinates
$\xi$ and $\lambda$ to obtain the equations of motion in the coordinates $x_1$
and $x_2$:

$m_1 D^2 x_1 + m_2 (D^2 x_1 + D^2 x_2) + k_1 x_1 = 0$ (1.205)

$m_2 (D^2 x_1 + D^2 x_2) + k_2 x_2 = 0$ (1.206)
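
The elimination of the glue coordinates can be checked numerically. The sketch below (Python, standard library only) uses assumed sample values for the masses and spring constants, and its function names are hypothetical. It solves equations (1.205) and (1.206) for the accelerations, recovers $\lambda$ from equation (1.203) using $D^2\xi = D^2 x_1$ from the constraint, and evaluates the residual of equation (1.201).

```python
m1, m2, k1, k2 = 2.0, 3.0, 5.0, 7.0  # assumed sample values


def accelerations(x1, x2):
    # Solve (1.205)-(1.206) for a1 = D^2 x1 and a2 = D^2 x2.
    # Subtracting (1.206) from (1.205) gives m1 a1 + k1 x1 - k2 x2 = 0.
    a1 = (k2 * x2 - k1 * x1) / m1
    # Equation (1.206): m2 (a1 + a2) + k2 x2 = 0.
    a2 = -k2 * x2 / m2 - a1
    return a1, a2


def constraint_force(x1, x2):
    # Equation (1.203), with D^2 xi = D^2 x1 from the constraint (1.204).
    a1, a2 = accelerations(x1, x2)
    return m2 * (a1 + a2)


def residual_1201(x1, x2):
    # Residual of (1.201) written as m1 D^2 x1 + k1 x1 + lambda = 0.
    a1, _ = accelerations(x1, x2)
    return m1 * a1 + k1 * x1 + constraint_force(x1, x2)
```

For any extensions the residual vanishes to roundoff, and `constraint_force(x1, x2)` equals `-k2 * x2`: the force of constraint is the force that the second spring exerts at the attachment point.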

This strategy can be generalized. We can make a library of 
primitive components. Each component may be characterized by 
a Lagrangian with additional degrees of freedom for the terminals 
where that component may be attached to others. We then can 
construct composite Lagrangians by combining components using 
constraints to glue together the terminals. 

Exercise 1.34: Combining Lagrangians 

a. Make another primitive component that is compatible with the spring- 
mass structures described in this section. For example, make a pendu- 
lum that can attach to the spring-mass system. Build a combination 
and derive the equations of motion. Be careful, the algebra is horrible 
if you choose bad coordinates. 

b. For a nice little project, construct a family of compatible mechanical 
parts, characterized by appropriate Lagrangians, that can be combined 
in a variety of ways to make interesting mechanisms. Remember that in 
a good language the result of combining pieces should be a piece of the 
same kind that can be further combined with other pieces. 
Exercise 1.35: Bead on a triaxial surface

Consider again the motion of a bead constrained to move on a triaxial 
surface from exercise 1.18. Reformulate this using rectangular coordi- 
nates as the generalized coordinates with an explicit constraint that the 
bead stay on the surface. Find a Lagrangian and show that the Lagrange 
equations are equivalent to those found in exercise 1.18. 

Exercise 1.36: Motion of a tiny golf ball 

Consider the motion of a golf ball idealized as a point mass constrained
to a frictionless smooth surface of varying height $h(x, y)$ in a uniform
gravitational field with acceleration $g$.

a. Find an augmented Lagrangian for this system, and derive the equa- 
tions governing the motion of the point mass in x and y. 

b. Under what conditions is this approximated by a potential function 
V(x,y) = mgh(x, y)? 

c. Assume that we have an h(x, y) that is axisymmetric about x = y = 
0. Can you find such an h that yields motions with closed orbits? 

1.10.2 Derivative Constraints 

Here we investigate velocity-dependent constraints that are “total
time derivatives” of velocity-independent constraints. The
methods presented so far do not apply because the constraint is
velocity-dependent.

Consider a velocity-dependent constraint $\psi = 0$. That $\psi$ is a
total time derivative means that there exists a velocity-independent
function $\phi$ such that

$\psi \circ \Gamma[q] = D(\phi \circ \Gamma[q]).$ (1.207)

That $\phi$ is velocity-independent means $\partial_2 \phi = 0$. As state functions
the relationship between $\psi$ and $\phi$ is

$\psi = D_t \phi = \partial_0 \phi + \partial_1 \phi \, \dot Q.$ (1.208)

Given a $\psi$ we can find $\phi$ by solving this linear partial differential
equation. The solution is determined up to a constant, so $\psi = 0$
implies $\phi = K$ for some constant $K$. On the other hand, if we
knew $\phi = K$ then $\psi = 0$ follows. Thus the velocity-dependent
constraint $\psi = 0$ is equivalent to the velocity-independent
constraint $\phi = K$, and we know how to find Lagrange equations for
such systems.
If $L$ is a Lagrangian for the unconstrained problem, the
Lagrange equations with the constraint $\phi = K$ are

$(E[L] + \lambda E[\phi]) \circ \Gamma[q] = 0,$ (1.209)

where $\lambda$ is a function of time that will be eliminated during the
solution process. The constant $K$ does not affect the Lagrange
equations. The function $\phi$ is velocity-independent, $\partial_2 \phi = 0$, so the
Lagrange equations become

$(E[L] - \lambda \, \partial_1 \phi) \circ \Gamma[q] = 0.$ (1.210)

From equation (1.208) we see that

$\partial_1 \phi = \partial_2 \psi,$ (1.211)

so the Lagrange equations with the constraint $\psi = 0$ are

$E[L] \circ \Gamma[q] = \lambda \, \partial_2 \psi \circ \Gamma[q].$ (1.212)

The important feature is that we can write the Lagrange equations
directly in terms of $\psi$ without having to produce the integral $\phi$.
Of course the validity of these Lagrange equations depends on the
existence of the integral $\phi$.
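
As a concrete illustration, take coordinates $q = (x, \theta)$ and a constant $R$, with the velocity-independent function $\phi$ written below. This particular constraint is an assumed example, chosen to have the form of a rolling constraint. The relation between $\partial_1 \phi$ and $\partial_2 \psi$ can then be read off directly:

```latex
\begin{aligned}
\phi(t; x, \theta) &= R\theta - x, \\
\psi = D_t \phi &= \partial_0 \phi + \partial_1 \phi \, \dot Q
   = 0 + \begin{bmatrix} -1 & R \end{bmatrix}
     \begin{bmatrix} \dot x \\ \dot\theta \end{bmatrix}
   = R \dot\theta - \dot x, \\
\partial_1 \phi &= \begin{bmatrix} -1 & R \end{bmatrix} = \partial_2 \psi .
\end{aligned}
```

In this example the constraint term in the Lagrange equations is $\lambda \begin{bmatrix} -1 & R \end{bmatrix}$, which can be computed from $\psi$ alone.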

It turns out that the augmented Lagrangian trick also works
here. These Lagrange equations are given if we augment the
Lagrangian with the constraint $\psi$ multiplied by a function of time
$\lambda'$:

$L' = L + \lambda' \psi.$ (1.213)

The Lagrange equations for $L'$ turn out to be

$E[L] \circ \Gamma[q] = -D\lambda' \, \partial_2 \psi \circ \Gamma[q],$ (1.214)

which, with the identification $\lambda = -D\lambda'$, are the same as Lagrange
equations (1.212).

Sometimes a problem is naturally formulated in terms of velocity- 
dependent constraints. The formalism we have developed will 
handle any velocity-dependent constraint that can be written in 
terms of the derivative of a coordinate constraint. Such a con- 
straint is called an integrable constraint. Any system for which 
the constraints can be put in the form of a coordinate constraint, 
or are already in that form, is called a holonomic system. 
Figure 1.10 A massive hoop rolling, without slipping, down an
inclined plane.



Exercise 1.37:

Show that the augmented Lagrangian (1.213) does lead to the Lagrange
equations (1.214), taking into account the fact that $\psi$ is a total time
derivative of $\phi$.

Goldstein’s hoop 

Here we consider a problem for which the constraint can be rep- 
resented as a time derivative of a coordinate constraint: a hoop 
of mass M rolling, without slipping, down a (one-dimensional) 
inclined plane (see figure 1.10). 94 

We will formulate this problem in terms of the two coordinates
$\theta$, the rotation of an arbitrary point on the hoop from an arbitrary
reference direction, and $x$, the linear progress down the inclined
plane. The constraint is that the hoop does not slip. Thus a
change in $\theta$ is exactly reflected in a change in $x$; the constraint
function is:

$\psi(t; x, \theta; \dot x, \dot\theta) = R \dot\theta - \dot x.$ (1.215)

This constraint is phrased as a relation among generalized
velocities, but it could be integrated to get $x = R\theta + c$. We may form
our augmented Lagrangian with either the integrated constraint
or its derivative.



94 This example appears in [18], pages 49-51.
The kinetic energy has two parts, the energy of rotation of the
hoop and the energy of the motion of its center of mass. 95 The
potential energy of the hoop decreases as the height decreases.



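Thus we may write the augmented Lagrangian:

$L(t; x, \theta, \lambda; \dot x, \dot\theta, \dot\lambda) = \tfrac{1}{2} M R^2 \dot\theta^2 + \tfrac{1}{2} M \dot x^2 + M g x \sin\varphi + \lambda (R \dot\theta - \dot x).$ (1.216)

Lagrange’s equations are

$M D^2 x - D\lambda = M g \sin\varphi$ (1.217)

$M R^2 D^2\theta + R \, D\lambda = 0$ (1.218)

$R \, D\theta - D x = 0.$ (1.219)

And by differentiation of the third Lagrange equation we obtain

$D^2 x = R \, D^2\theta.$ (1.220)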



By combining these equations we can solve for the dynamical
quantities of interest. For this case of a rolling hoop the linear
acceleration

$D^2 x = \tfrac{1}{2} g \sin\varphi$ (1.221)

is just half of what it would have been if the mass had just slid
down a frictionless plane without rotating. Note that for this hoop
$D^2 x$ is independent of both $M$ and $R$. We see from the Lagrange
equations that $D\lambda$ can be interpreted as the friction force involved
in enforcing the constraint. With the sign conventions of equations
(1.217) and (1.218), the frictional force of constraint is

$D\lambda = -\tfrac{1}{2} M g \sin\varphi$ (1.222)

(the negative sign indicates that it acts up the plane, opposing the
motion), and the angular acceleration is

$D^2\theta = \frac{1}{2R} g \sin\varphi.$ (1.223)
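
The combination step can be spelled out as follows. This is a sketch of the algebra that uses only equations (1.217) through (1.220), with $\varphi$ denoting the inclination of the plane:

```latex
\begin{aligned}
D\lambda &= -M R \, D^2\theta = -M \, D^2 x
  && \text{from (1.218) and (1.220)}, \\
M D^2 x + M D^2 x &= M g \sin\varphi
  && \text{substituting into (1.217)}, \\
D^2 x &= \tfrac{1}{2} g \sin\varphi, \\
D^2\theta &= \frac{D^2 x}{R} = \frac{1}{2R} g \sin\varphi, \\
D\lambda &= -M \, D^2 x = -\tfrac{1}{2} M g \sin\varphi .
\end{aligned}
```

The friction therefore has magnitude $\tfrac{1}{2} M g \sin\varphi$ and is directed up the plane, cancelling half of the gravitational force component $M g \sin\varphi$ along the plane.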



95 We will see in chapter 2 how to compute the kinetic energy of rotation, but
for now the answer is $\tfrac{1}{2} M R^2 \dot\theta^2$.
1.10.3 Non-Holonomic Systems

Systems with constraints that are not integrable are termed non- 
holonomic systems. A constraint is not integrable if it cannot be 
written in terms of an equivalent coordinate constraint. An ex- 
ample of a non-holonomic system is a ball rolling without slipping 
in a bowl. As the ball rolls it must turn so that the surface of the 
ball does not move relative to the bowl at the point of contact. 
This looks like it might establish a relation between the location of 
the ball in the bowl and the orientation of the ball, but it doesn’t. 
The ball may return to the same place in the bowl with different 
orientations depending on the intervening path the ball has taken. 
As a consequence the constraints may not be used to eliminate any 
coordinates. 

What are the equations of motion governing non-holonomic
systems? For the restricted set of systems with non-holonomic
constraints that are linear in the velocities, it is widely reported 96
that the equations of motion are the following. Let $\psi$ have the
form

$\psi(t, q, v) = G_1(t, q) v + G_2(t, q),$ (1.224)

a state function that is linear in the velocities. We assume $\psi$ is not
a total time derivative. If $L$ is a Lagrangian for the unconstrained
system, then the equations of motion are asserted to be

$E[L] \circ \Gamma[q] = \lambda \, G_1 \circ \Gamma[q] = \lambda \, \partial_2 \psi \circ \Gamma[q].$ (1.225)

Together with the constraint $\psi = 0$ the system is closed and the
evolution of the system is determined. Note that these equations
are identical to the Lagrange equations (1.212) for the case that $\psi$
is a total time derivative, but here the derivation of those equations
is no longer valid.

An essential step in the derivation of the Lagrange equations
for coordinate constraints $\phi = 0$ with $\partial_2 \phi = 0$ was to note that
two conditions must be satisfied:

$(E[L] \circ \Gamma[q]) \, \eta = 0,$ (1.226)



96 For some treatments of non-holonomic systems see, for example, Whit- 
taker [43], Goldstein [18], Gantmakher [17], or Arnold et al. [6]. 
and

$(\partial_1 \phi \circ \Gamma[q]) \, \eta = 0.$ (1.227)

Because $E[L] \circ \Gamma[q]$ is orthogonal to $\eta$, and $\eta$ is constrained to be
orthogonal to $\partial_1 \phi \circ \Gamma[q]$, the two must be parallel at each moment:

$E[L] \circ \Gamma[q] = \lambda \, \partial_1 \phi \circ \Gamma[q].$ (1.228)

The Lagrange equations for derivative constraints were derived
from this.

This derivation does not go through if the constraint function is
velocity-dependent. In this case, for a variation to be consistent
with the velocity-dependent constraint function $\psi$ it must satisfy
(see equation 1.179)

$(\partial_1 \psi \circ \Gamma[q]) \, \eta + (\partial_2 \psi \circ \Gamma[q]) \, D\eta = 0.$ (1.229)

We may no longer eliminate $\eta$ by the same argument, because $\eta$
is no longer orthogonal to $\partial_1 \psi \circ \Gamma[q]$, and we cannot rewrite the
constraint as a coordinate constraint because $\psi$ is, by assumption,
not integrable.

The following is the derivation of the non-holonomic equations
from Arnold, et al. ([6]), translated into our notation. Define the
“virtual velocities” $\xi$ to be any velocity satisfying

$(\partial_2 \psi \circ \Gamma[q]) \, \xi = 0.$ (1.230)

The “principle of d’Alembert-Lagrange,” according to Arnold,
states that

$(E[L] \circ \Gamma[q]) \, \xi = 0,$ (1.231)

for any virtual velocity $\xi$. Because $\xi$ is arbitrary except that it is
required to be orthogonal to $\partial_2 \psi \circ \Gamma[q]$ and any such $\xi$ is orthogonal
to $E[L] \circ \Gamma[q]$, then $\partial_2 \psi \circ \Gamma[q]$ must be parallel to $E[L] \circ \Gamma[q]$. So

$E[L] \circ \Gamma[q] = \lambda \, (\partial_2 \psi \circ \Gamma[q]),$ (1.232)

which are the non-holonomic equations.

To convert the stationary action equations to the equations of
Arnold we must do the following. To get from equation (1.226)
to equation (1.231), we must replace $\eta$ by $\xi$. However, to get
from equation (1.229) to equation (1.230), we must set $\eta = 0$ and
replace $D\eta$ by $\xi$. All “derivations” of the non-holonomic
equations have similar identifications. It comes down to this: the
non-holonomic equations do not follow from the action principle. They
are something else. Whether they are correct or not depends on
whether they agree with experiment.

For systems with coordinate constraints or derivative constraints
we have found that the Lagrange equations can be derived from
a Lagrangian that is augmented with the constraint. However, if
the constraints are not integrable the Lagrange equations for the
augmented Lagrangian are not the same as the non-holonomic
equations (1.225). 97 Let $L'$ be an augmented Lagrangian
with non-integrable constraint $\psi$:

$L'(t; q, \lambda; \dot q, \dot\lambda) = L(t, q, \dot q) + \lambda \, \psi(t, q, \dot q);$ (1.233)

then the Lagrange equations associated with the coordinates are:

$0 = E[L] \circ \Gamma[q] + D\lambda \, (\partial_2 \psi) \circ \Gamma[q] + \lambda \, D((\partial_2 \psi) \circ \Gamma[q]) - \lambda \, (\partial_1 \psi) \circ \Gamma[q].$ (1.234)

The Lagrange equation associated with $\lambda$ is just the constraint
equation

$\psi \circ \Gamma[q] = 0.$ (1.235)

An interesting feature of these equations is that they involve both
$\lambda$ and $D\lambda$. Thus the usual state variables $q$ and $Dq$, with the
constraint, are not sufficient to determine a full set of initial
conditions for the derived Lagrange equations; we need to specify an
initial value for $\lambda$ as well.
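
The coordinate equations (1.234) follow from the product rule applied to the term $\lambda \psi$. The following sketch of the steps treats $\lambda$ as a function of time along the path and takes the partial derivatives with respect to the coordinate and velocity slots only:

```latex
\begin{aligned}
\partial_2 L' &= \partial_2 L + \lambda \, \partial_2 \psi, \qquad
\partial_1 L' = \partial_1 L + \lambda \, \partial_1 \psi, \\
D(\partial_2 L' \circ \Gamma[q]) &= D(\partial_2 L \circ \Gamma[q])
  + D\lambda \, (\partial_2 \psi \circ \Gamma[q])
  + \lambda \, D(\partial_2 \psi \circ \Gamma[q]), \\
0 &= D(\partial_2 L' \circ \Gamma[q]) - \partial_1 L' \circ \Gamma[q] \\
  &= E[L] \circ \Gamma[q] + D\lambda \, (\partial_2 \psi \circ \Gamma[q])
  + \lambda \, D(\partial_2 \psi \circ \Gamma[q])
  - \lambda \, (\partial_1 \psi \circ \Gamma[q]).
\end{aligned}
```

If $\psi$ were a total time derivative, the last two terms would combine into $\lambda \, E[\psi] \circ \Gamma[q] = 0$, and the result would reduce to equation (1.214). For a non-integrable $\psi$ they do not cancel, which is why $\lambda$ itself, and not only $D\lambda$, appears in the equations.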

In general, for any particular physical system, equations (1.225)
and (1.234) are not the same, and in fact they have different
solutions. It is not apparent that either set of equations accurately
models the physical system. The first approach to non-holonomic
systems is not justified by extension of the arguments for the
holonomic case and the other is not fully determined. Perhaps this is
an indication that the models are inadequate; that more details
of how the constraints are maintained need to be specified.



97 Arnold, et al. [6] call the variational mechanics with the constraints added 
to the Lagrangian Vakonomic mechanics. 
1.11 Summary

To analyze a mechanical system we construct an action function 
that gives us a way to distinguish realizable motions from other 
conceivable motions of the system. The action function is con- 
structed so as to be stationary only on paths describing realizable 
motions, with respect to variations of the path. This is called the 
principle of stationary action. The principle of stationary action 
is a coordinate-independent specification of the realizable paths. 
For systems with or without constraints we may choose any sys- 
tem of coordinates that uniquely determines the configuration of 
the system. 

For a large variety of mechanical systems actions are integrals 
of a function, called the Lagrangian, along the path. For many 
systems an appropriate Lagrangian is the difference of the kinetic 
energy and the potential energy of the system. The choice of a 
Lagrangian for a system is not unique. 

For any system for which we have a Lagrangian action we can
formulate a system of ordinary differential equations, the Lagrange
equations, that is satisfied by any realizable path. The method of 
deriving the Lagrange equations from the Lagrangian is indepen- 
dent of the coordinate system used to formulate the Lagrangian. 
One freedom we have in formulation is that the addition of a to- 
tal time derivative to a Lagrangian for a system yields another 
Lagrangian that has the same Lagrange equations. 

The Lagrange equations are a set of ordinary differential equa- 
tions: there is a finite state that summarizes the history of the 
system and is sufficient to determine the future. There is an ef- 
fective procedure for evolving the motion of the system from a 
state at an instant. For many systems the state is determined by 
the coordinates and the rate of change of the coordinates at an 
instant. 

If there are continuous symmetries in a physical system there 
are conserved quantities associated with them. If the system can 
be formulated in such a way that the symmetries are manifest in 
missing coordinates in the Lagrangian then there are conserved 
momenta conjugate to those coordinates. If the Lagrangian is 
independent of time then there is a conserved energy. 
1.12 Projects

Exercise 1.38: A numerical investigation 

Consider a pendulum: a mass m supported on a massless rod of length 
l, in a uniform gravitational field. A Lagrangian for the pendulum is: 

$L(t, \theta, \dot\theta) = \tfrac{1}{2} m l^2 \dot\theta^2 + m g l \cos\theta$

For the pendulum, the period of the motion depends on the amplitude. 
We wish to find trajectories of the pendulum with a given frequency. 
Three methods of doing this present themselves: (1) solution by the 
principle of least action, (2) numerical integration of Lagrange’s equa- 
tion, and (3) analytic solution (which requires some exposure to elliptic 
functions). We will carry out all three, and compare the solution trajec- 
tories. 

To be specific, consider the parameters $m = 1$ kg, $l = 1$ m, $g =
9.8$ m s$^{-2}$. The frequency of small amplitude oscillations is $\omega_0 = \sqrt{g/l}$.
Let’s find the non-trivial solution that has the frequency $\omega = \tfrac{4}{5} \omega_0$.

a. The angle is periodic in time, so a Fourier series representation is 
appropriate. We can choose the origin of time so that a zero crossing 
of the angle is at time zero. Since the potential is even in the angle, 
the angle is an odd function of time. Thus we need only a sine series. 
Since the angle returns to zero after one-half period the angle is an odd 
function of time about the midpoint. Thus only odd terms of the series 
are present: 

$\theta(t) = \sum_{n=1}^{\infty} A_n \sin((2n - 1) \omega t).$

The amplitude of the trajectory is $A = \theta_{\max} = \sum_{n=1}^{\infty} (-1)^{n+1} A_n$.

Find approximations to the first few coefficients A n by minimizing 
the action. You will have to write a program similar to the find-path 
procedure in section 1.4. Watch out: there is more than one trajectory 
that minimizes the action. 

b. Write a program to numerically integrate Lagrange’s equations for
the trajectories of the pendulum. The trouble with using numerical
integration to solve this problem is that we do not know how the
frequency of the motion depends on the initial conditions. So we have to
guess, and then gradually improve our guess. Define a function $\Omega(\dot\theta)$
that numerically computes the frequency of the motion as a function of
the initial angular velocity (with $\theta = 0$). Find the trajectory by solving
$\Omega(\dot\theta) = \omega$ for the initial angular velocity of the desired trajectory.
Methods of solving this equation include successive bisection, minimizing the
squared residual, etc.; choose one.
Figure 1.11 The double pendulum is pinned in two joints so that its
members are free to move in a plane.



c. Now let’s formulate the analytic solution for the frequency as a
function of amplitude. The period of the motion is simply

$T = 4 \int_0^{T/4} dt = 4 \int_0^{A} \frac{1}{\dot\theta} \, d\theta.$

Using the energy, solve for $\dot\theta$ in terms of the amplitude $A$ and $\theta$ to write
the required integral explicitly. This integral can be written in terms
of elliptic functions, but in a sense this does not solve the problem: we
still have to compute the elliptic functions. Let’s avoid this excursion
into elliptic functions and just do the integral numerically using the
procedure definite-integral. This gives the frequency for a specified
amplitude $A$, whereas our problem requires the inverse: the amplitude
for a specified frequency. That inverse problem can be solved as in part b.

Exercise 1.39: Double pendulum behavior 

Consider the ideal double pendulum shown in figure 1.11.

a. Formulate a Lagrangian to describe the dynamics. Derive the
equations of motion in terms of the given angles $\theta_1$ and $\theta_2$. Put the equations
into a form appropriate for numerical integration.

Assume the following system parameters:

$l_1 = 1.0$ m
$l_2 = 0.9$ m
$m_1 = 1.0$ kg
$m_2 = 3.0$ kg

b. Prepare graphs showing the behavior of each angle as a function of
time when the system is started with the initial conditions:

$\theta_1(0) = \pi/2$ radian
$\theta_2(0) = \pi$ radian
$\dot\theta_1(0) = 0$ radian/sec
$\dot\theta_2(0) = 0$ radian/sec

Make the graphs extend to 50 seconds. Save the state points at .125
second intervals in a list.

c. Make a graph of the behavior of the energy of your system as a
function of time. The energy should be conserved. How good is the
conservation you obtained?

d. Repeat the experiment of part b with the $m_2$ bob $10^{-10}$ m higher
than before. Form the list of squared differences of the distances between
the $m_2$ bobs in the two experiments, and plot the log of that against
time. What do you see?

e. Repeat the previous comparison, but this time with the initial
conditions:

$\theta_1(0) = \pi/2$ radian
$\theta_2(0) = 0$ radian
$\dot\theta_1(0) = 0$ radian/sec
$\dot\theta_2(0) = 0$ radian/sec

What do you see here?




2 

Rigid Bodies 



The polhode rolls without slipping on the
herpolhode lying in the invariable plane.

Herbert Goldstein, Classical Mechanics (1950),
footnote on p. 161.



The motion of rigid bodies presents many surprising phenomena. 

Consider the motion of a top. A top is usually thought of as 
an axisymmetric body, subject to gravity, with a point on the 
axis of symmetry that is fixed in space. The top is spun, and in 
general executes some complicated motion. We observe that the 
top usually settles down into an unusual motion in which the axis 
of the top slowly precesses about the vertical, apparently moving 
perpendicular to the direction in which gravity is attempting to 
accelerate it. 

Consider the motion of a book thrown into the air. 1 Books 
have three main axes. Idealized as a brick with rectangular faces, 
the three axes are the lines through the centers of opposite faces. 
Try spinning the book about each axis. The motion of the book 
spun about the longest and the shortest axis is a simple regular 
rotation, perhaps with a little wobble depending on how carefully 
it is thrown. The motion of the book spun about the intermediate 
axis is qualitatively different: however carefully the book is spun 
about the intermediate axis the book tumbles. 

The rotation of the Moon is peculiar in that the Moon always 
presents the same face to the Earth, indicating that the rotational 
period and the orbit period are the same. Considering that the 
orbit of the Moon is constantly changing because of interactions 
with the Sun and other planets, and therefore the orbital period 
of the Moon is constantly undergoing small variations, we might 
expect that the face of the Moon that we see would slowly change, 
but it does not. What is special about the face that is presented 
to us? 



1 We put a rubber band around the book so that it does not open. 
A rigid body may be thought of as a large number of constituent
particles with rigid constraints among them. Thus the dynamical 
principles governing the motion of rigid bodies are the same as 
those governing the motion of any other system of particles with 
rigid constraints. What is new here is that the number of con- 
stituent particles is very large and we need to develop new tools 
to handle them effectively. 

We have found that a Lagrangian for a system with rigid con- 
straints can be written as the difference of the kinetic and po- 
tential energies. The kinetic and potential energies are naturally 
expressed in terms of the positions and velocities of the constituent 
particles. To write the Lagrangian in terms of the generalized co- 
ordinates and velocities we must specify functions that relate the 
generalized coordinates to the positions of the constituent parti- 
cles. In the systems with rigid constraints considered up to now 
these functions were explicitly given for each of the constituent 
particles and individually included in the derivation of the
Lagrangian. For a rigid body there are too many constituent particles
to handle each one of them in this way. We need to find means 
of expressing the kinetic and potential energies of rigid bodies in 
terms of the generalized coordinates and velocities, without going 
through the particle-by-particle details. 

The strategy is to first rewrite the kinetic and potential energies 
in terms of quantities that characterize essential aspects of the 
distribution of mass in the body and the state of motion of the 
body. Only later do we introduce generalized coordinates. For 
the kinetic energy, it turns out a small number of parameters 
completely specify the state of motion and the relevant aspects 
of the distribution of mass in the body. For the potential energy, 
we find that for some specific problems the potential energy can 
be represented with a small number of parameters, but in general 
we have to make approximations to obtain a representation with 
a manageable number of parameters. 



2.1 Rotational Kinetic Energy 

We consider a rigid body to be made up of a large number of
constituent particles with mass $m_\alpha$, position $\vec x_\alpha$, and velocities
$\dot{\vec x}_\alpha$, with rigid positional constraints among them. The kinetic
energy is

$\sum_\alpha \tfrac{1}{2} m_\alpha \dot{\vec x}_\alpha \cdot \dot{\vec x}_\alpha.$ (2.1)

It turns out that the kinetic energy of a rigid body can be sepa- 
rated into two pieces: a kinetic energy of translation and a kinetic 
energy of rotation. Let’s see how this comes about. 

The configuration of a rigid body is fully specified given the 
location of any point in the body and the orientation of the body. 
This suggests that it would be useful to decompose the position 
vectors for the constituent particles as the sum of the vector X 
to some reference position in the body and the vector from the 
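reference position to the particular constituent element with index
$\alpha$:

$\vec x_\alpha = \vec X + \vec\xi_\alpha.$ (2.2)

Along paths, the velocities are related by

$\dot{\vec x}_\alpha = \dot{\vec X} + \dot{\vec\xi}_\alpha.$ (2.3)

So in terms of $\vec X$ and $\vec\xi_\alpha$ the kinetic energy is

$\sum_\alpha \tfrac{1}{2} m_\alpha (\dot{\vec X} + \dot{\vec\xi}_\alpha) \cdot (\dot{\vec X} + \dot{\vec\xi}_\alpha) = \sum_\alpha \tfrac{1}{2} m_\alpha \left[ \dot{\vec X} \cdot \dot{\vec X} + 2 \dot{\vec X} \cdot \dot{\vec\xi}_\alpha + \dot{\vec\xi}_\alpha \cdot \dot{\vec\xi}_\alpha \right].$ (2.4)

If we select the reference position in the body to be its center of
mass,

$\vec X = \frac{1}{M} \sum_\alpha m_\alpha \vec x_\alpha,$ (2.5)

where $M = \sum_\alpha m_\alpha$ is the total mass of the body, then

$\sum_\alpha m_\alpha \vec\xi_\alpha = \sum_\alpha m_\alpha (\vec x_\alpha - \vec X) = 0.$ (2.6)

So along paths the relative velocities satisfy

$\sum_\alpha m_\alpha \dot{\vec\xi}_\alpha = 0.$ (2.7)

The kinetic energy is then

$\tfrac{1}{2} M \dot{\vec X} \cdot \dot{\vec X} + \sum_\alpha \tfrac{1}{2} m_\alpha \dot{\vec\xi}_\alpha \cdot \dot{\vec\xi}_\alpha.$ (2.8)

The kinetic energy is the sum of the kinetic energy of the motion
of the total mass at the center of mass

$\tfrac{1}{2} M \dot{\vec X} \cdot \dot{\vec X},$ (2.9)

and the kinetic energy of rotation about the center of mass

$\sum_\alpha \tfrac{1}{2} m_\alpha \dot{\vec\xi}_\alpha \cdot \dot{\vec\xi}_\alpha.$ (2.10)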
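
The separation of the kinetic energy can be checked numerically. The following minimal sketch (Python, standard library only) uses randomly generated masses and velocities; the function name `kinetic_energy_split` and the sample data are assumptions made for illustration. It computes the total kinetic energy (2.1), the center-of-mass term (2.9), the relative term (2.10), and the sum that should vanish by the relation between relative velocities.

```python
import random


def kinetic_energy_split(masses, velocities):
    # masses: list of particle masses; velocities: list of 3-tuples.
    M = sum(masses)
    # Center-of-mass velocity: the time derivative of the mass-weighted
    # mean position.
    V = [sum(m * v[i] for m, v in zip(masses, velocities)) / M
         for i in range(3)]
    # Velocities relative to the center of mass.
    rel = [[v[i] - V[i] for i in range(3)] for v in velocities]
    total = sum(0.5 * m * sum(c * c for c in v)
                for m, v in zip(masses, velocities))
    translational = 0.5 * M * sum(c * c for c in V)
    relative = sum(0.5 * m * sum(c * c for c in r)
                   for m, r in zip(masses, rel))
    # Mass-weighted sum of relative velocities; should be zero.
    cross_term = [sum(m * r[i] for m, r in zip(masses, rel))
                  for i in range(3)]
    return total, translational, relative, cross_term


random.seed(0)
masses = [random.uniform(0.5, 2.0) for _ in range(10)]
velocities = [tuple(random.uniform(-1.0, 1.0) for _ in range(3))
              for _ in range(10)]
total, translational, relative, cross_term = kinetic_energy_split(
    masses, velocities)
```

Here `total` agrees with `translational + relative` to roundoff, and each component of `cross_term` is zero to roundoff. The algebraic identity holds for any set of particle velocities; it is the rigidity of the body that lets us interpret the relative term as a kinetic energy of rotation.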



Written in terms of appropriate generalized coordinates the ki- 
netic energy is a Lagrangian for a free rigid body. If we choose 
generalized coordinates so that the center of mass position is en- 
tirely specified by some of them and the orientation is entirely 
specified by others, then the Lagrange equations for a free rigid 
body will decouple into two groups of equations, one concerned 
with the motion of the center of mass and one concerned with the 
orientation. 

Such a separation might occur in other problems, such as a 
rigid body moving in a uniform gravitational field, but in general, 
potential energies cannot be separated as the kinetic energy sep- 
arates. So the motion of the center of mass and the rotational 
motion are usually coupled through the potential. Even in these 
cases, it is usually an advantage to choose generalized coordinates 
that separately specify the position of the center of mass and the 
orientation. 



2.2 Kinematics of Rotation 

The motion of a rigid body about a center of rotation, a reference 
position that is fixed with respect to the body, is characterized 
at each moment by a rotation axis and a rate of rotation. Let’s
elaborate. 

We can get from any orientation of a body to any other orien- 
tation of the body by a rotation of the body. That this is true is 
called Euler’s theorem . 2 We know that rotations have the prop- 
erty that they do not commute: the composition of successive 
rotations in general depends on the order of operation. Rotating 
a book about the x axis and then about the z axis puts the book 
in a different orientation than rotating the book about the z axis 
and then about the x axis. Nevertheless, Euler’s theorem states 
that however many rotations have been composed to reach a given 
orientation, the orientation could have been reached with a single 
rotation. Try it! We take a book, rotate it this way, then that, 
and then some other way — then find the rotation that does the job 
in one step. So a rotation can be specified by an axis of rotation 
and the angular amount of the rotation. 
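
Both claims can be illustrated with rotation matrices. The sketch below (Python, standard library only) is an illustration under stated assumptions: the matrix representation, the quarter-turn angles, and the helper names are not from the text, and the axis-angle extraction shown is valid only when the net rotation angle lies strictly between 0 and $\pi$. It composes a quarter turn about the x axis with a quarter turn about the z axis in both orders, then finds the single rotation equivalent to one of the compositions.

```python
import math


def rot_x(a):
    # Rotation by angle a about the x axis.
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z(a):
    # Rotation by angle a about the z axis.
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(A, B):
    # Product of two 3x3 matrices; applying A after B.
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def axis_angle(R):
    # Axis and angle of a rotation matrix, for angles strictly in (0, pi).
    angle = math.acos((R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0)
    s = 2.0 * math.sin(angle)
    axis = [(R[2][1] - R[1][2]) / s,
            (R[0][2] - R[2][0]) / s,
            (R[1][0] - R[0][1]) / s]
    return axis, angle


quarter = math.pi / 2.0
x_then_z = matmul(rot_z(quarter), rot_x(quarter))  # about x first, then z
z_then_x = matmul(rot_x(quarter), rot_z(quarter))  # about z first, then x
axis, angle = axis_angle(x_then_z)
```

The two composite matrices differ, showing that the rotations do not commute. The composition `x_then_z` is a single rotation by $2\pi/3$ about the axis $(1, 1, 1)/\sqrt{3}$, as Euler’s theorem says it must be some single rotation.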

If the orientation of a body evolves over some interval of time 
then the orientation at the beginning and the end of the interval 
can be connected by a single rotation. In the limit that the du- 
ration of the interval goes to zero the rotation axis approaches a 
unique instantaneous rotation axis. And in this limit the ratio of 
the angle of rotation and the duration of the interval approaches 
the instantaneous rate of rotation. We represent this
instantaneous rotational motion by the angular velocity vector $\vec\omega$, which
points in the direction of the rotation axis (with the right-hand
rule giving the direction of rotation about the axis) and has a
magnitude equal to the rate of rotation.

If the angular velocity vector for a body is $\vec\omega$ then the velocities
of the constituent particles are perpendicular to the vectors to
the constituent particles and proportional to the rate of rotation
of the body and the distance of the constituent particle from the
instantaneous rotation axis:

$\dot{\vec\xi}_\alpha = \vec\omega \times \vec\xi_\alpha.$ (2.11)

Isn’t it interesting that we have found a concise way of specifying
how the orientation of the body is changing, even though we
have not yet described a way to specify the orientation itself?
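The two geometric claims about the velocities can be checked for a single constituent. This is a small sketch with made-up numbers: the angular velocity `omega` and the position `xi` are hypothetical values, and the helpers `cross`, `dot`, and `norm` are ours. It computes the velocity as the cross product of the angular velocity with the position, then compares its magnitude with the rate of rotation times the distance from the rotation axis.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

omega = [0.0, 0.0, 2.0]   # hypothetical angular velocity: rate 2 about the z axis
xi = [3.0, 4.0, 5.0]      # hypothetical position of one constituent

velocity = cross(omega, xi)   # velocity of the constituent

# Distance of the constituent from the instantaneous rotation axis.
rate = norm(omega)
axis = [w / rate for w in omega]
parallel = dot(xi, axis)
perpendicular = [x - parallel * a for x, a in zip(xi, axis)]
distance_from_axis = norm(perpendicular)

print(dot(velocity, xi))                           # perpendicularity check
print(norm(velocity), rate * distance_from_axis)   # magnitudes to compare
```

The dot product of the velocity with the position vanishes, and the two magnitudes agree, as the text asserts.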



2 For an elementary geometric proof of Euler’s theorem see Whittaker [43].

2.3 Moments of Inertia

The rotational kinetic energy is the sum of the kinetic energy of 
each of the constituents of the rigid body. We can rewrite the 
rotational kinetic energy in terms of the angular velocity vector 
and certain aggregate quantities determined by the distribution 
of mass in the rigid body. 

Substituting our representation of the relative velocity vectors 
into the rotational kinetic energy we obtain 

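$$\frac{1}{2} \sum_\alpha m_\alpha \dot{\vec{\xi}}_\alpha \cdot \dot{\vec{\xi}}_\alpha = \frac{1}{2} \sum_\alpha m_\alpha (\vec{\omega} \times \vec{\xi}_\alpha) \cdot (\vec{\omega} \times \vec{\xi}_\alpha). \qquad (2.12)$$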

We introduce an arbitrary rectangular coordinate system with origin
at the center of rotation and with basis vectors $\hat{e}_0$, $\hat{e}_1$, and $\hat{e}_2$,
with the property that $\hat{e}_0 \times \hat{e}_1 = \hat{e}_2$. The components of $\vec{\omega}$ on this
coordinate system are $\omega^0$, $\omega^1$, and $\omega^2$. Rewriting $\vec{\omega}$ in terms of its
components, the rotational kinetic energy becomes



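$$\begin{aligned}
&\frac{1}{2} \sum_\alpha m_\alpha \Big(\Big(\sum_i \hat{e}_i \omega^i\Big) \times \vec{\xi}_\alpha\Big) \cdot \Big(\Big(\sum_j \hat{e}_j \omega^j\Big) \times \vec{\xi}_\alpha\Big) \\
&\quad = \frac{1}{2} \sum_{ij} \omega^i \omega^j \sum_\alpha m_\alpha (\hat{e}_i \times \vec{\xi}_\alpha) \cdot (\hat{e}_j \times \vec{\xi}_\alpha) \\
&\quad = \frac{1}{2} \sum_{ij} \omega^i \omega^j I_{ij},
\end{aligned} \qquad (2.13)$$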


with 




$$I_{ij} = \sum_\alpha m_\alpha (\hat{e}_i \times \vec{\xi}_\alpha) \cdot (\hat{e}_j \times \vec{\xi}_\alpha). \qquad (2.14)$$



The quantities $I_{ij}$ are the components of the inertia tensor with
respect to the chosen coordinate system. Note what a remarkable
form the kinetic energy has taken. All we have done is interchange
the order of summations, but now the kinetic energy is written as
a sum of products of components of the angular velocity vector,
which completely specify how the orientation of the body is changing,
and the quantity $I_{ij}$, which depends solely on the distribution
of mass in the body relative to the chosen coordinate system.
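The interchange of summations can be checked numerically. The sketch below assumes a toy rigid body made of three point masses (the masses, positions, and angular velocity are made-up values, and the helper names are ours). It computes the rotational kinetic energy once directly from the constituent velocities and once from the components of the inertia tensor.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical body: three point masses at positions relative to the center of rotation.
masses = [1.0, 2.0, 3.0]
positions = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [1.0, 2.0, -1.0]]
omega = [0.5, -1.0, 2.0]   # hypothetical components of the angular velocity

basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def inertia_component(i, j):
    """Sum over constituents of m (e_i x xi) . (e_j x xi)."""
    return sum(m * dot(cross(basis[i], xi), cross(basis[j], xi))
               for m, xi in zip(masses, positions))

inertia = [[inertia_component(i, j) for j in range(3)] for i in range(3)]

# Kinetic energy summed directly over the constituents.
T_direct = 0.5 * sum(m * dot(cross(omega, xi), cross(omega, xi))
                     for m, xi in zip(masses, positions))

# Kinetic energy as a sum of products of angular velocity components
# with the components of the inertia tensor.
T_tensor = 0.5 * sum(omega[i] * omega[j] * inertia[i][j]
                     for i in range(3) for j in range(3))

print(T_direct, T_tensor)
```

The two numbers agree to rounding error. Only `inertia` depends on the mass distribution, so it can be computed once and reused for any angular velocity.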

We will deduce a number of properties of the inertia tensor.
First, we find a somewhat simpler expression for it. The components
of the vector $\vec{\xi}_\alpha$ are $(\xi_\alpha, \eta_\alpha, \zeta_\alpha)$.³ Rewriting $\vec{\xi}_\alpha$ as a sum
over its components, and simplifying the elementary vector products
of basis vectors, the components of the inertia tensor can be
arranged in the inertia matrix $\mathbf{I}$, which looks like:



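$$\begin{pmatrix}
\sum_\alpha m_\alpha (\eta_\alpha^2 + \zeta_\alpha^2) & -\sum_\alpha m_\alpha \xi_\alpha \eta_\alpha & -\sum_\alpha m_\alpha \xi_\alpha \zeta_\alpha \\
-\sum_\alpha m_\alpha \eta_\alpha \xi_\alpha & \sum_\alpha m_\alpha (\xi_\alpha^2 + \zeta_\alpha^2) & -\sum_\alpha m_\alpha \eta_\alpha \zeta_\alpha \\
-\sum_\alpha m_\alpha \zeta_\alpha \xi_\alpha & -\sum_\alpha m_\alpha \zeta_\alpha \eta_\alpha & \sum_\alpha m_\alpha (\xi_\alpha^2 + \eta_\alpha^2)
\end{pmatrix} \qquad (2.15)$$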



The inertia tensor has real components and is symmetric: $I_{jk} = I_{kj}$.
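As a check on the simplification, the explicit matrix of sums can be compared with the cross-product definition of the components. This is a sketch for a hypothetical body of three point masses; the numbers and the function names `inertia_from_cross_products` and `inertia_matrix` are ours.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def inertia_from_cross_products(masses, positions):
    """Components defined as sums of m (e_i x xi) . (e_j x xi)."""
    basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    return [[sum(m * dot(cross(basis[i], p), cross(basis[j], p))
                 for m, p in zip(masses, positions))
             for j in range(3)] for i in range(3)]

def inertia_matrix(masses, positions):
    """Explicit matrix of sums over the components (xi, eta, zeta)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, (xi, eta, zeta) in zip(masses, positions):
        I[0][0] += m * (eta**2 + zeta**2)
        I[1][1] += m * (xi**2 + zeta**2)
        I[2][2] += m * (xi**2 + eta**2)
        I[0][1] -= m * xi * eta
        I[1][0] -= m * eta * xi
        I[0][2] -= m * xi * zeta
        I[2][0] -= m * zeta * xi
        I[1][2] -= m * eta * zeta
        I[2][1] -= m * zeta * eta
    return I

# Hypothetical body: three point masses.
masses = [1.0, 2.0, 3.0]
positions = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [1.0, 2.0, -1.0]]

explicit = inertia_matrix(masses, positions)
from_definition = inertia_from_cross_products(masses, positions)
for row in explicit:
    print(row)
```

Both constructions give the same matrix, and the matrix equals its transpose.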

We define the moment of inertia I about a line by 



$$I = \sum_\alpha m_\alpha (\xi_\alpha^\perp)^2, \qquad (2.16)$$



where $\xi_\alpha^\perp$ is the perpendicular distance from the line to the
constituent with index $\alpha$. The diagonal components of the inertia
tensor are recognized as the moments of inertia about the lines
coinciding with the coordinate axes $\hat{e}_i$. The off-diagonal
components of the inertia tensor are called products of inertia.
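To see that the diagonal components are moments of inertia about the coordinate axes, we can compute the moment about a line directly from perpendicular distances. The sketch below assumes the line passes through the origin and is given by a unit direction; the body is a hypothetical set of three point masses, and the function name `moment_about_line` is ours.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def moment_about_line(masses, positions, direction):
    """Sum of m times squared perpendicular distance to a line through
    the origin. Assumes direction is a unit vector."""
    total = 0.0
    for m, p in zip(masses, positions):
        along = dot(p, direction)               # component along the line
        perp_squared = dot(p, p) - along**2     # squared perpendicular distance
        total += m * perp_squared
    return total

# Hypothetical body: three point masses with components (xi, eta, zeta).
masses = [1.0, 2.0, 3.0]
positions = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [1.0, 2.0, -1.0]]

# Moment about the line coinciding with the first coordinate axis.
moment_axis_0 = moment_about_line(masses, positions, [1.0, 0.0, 0.0])

# First diagonal component of the inertia matrix: sum of m (eta^2 + zeta^2).
diagonal_00 = sum(m * (eta**2 + zeta**2)
                  for m, (xi, eta, zeta) in zip(masses, positions))

print(moment_axis_0, diagonal_00)
```

The two values coincide because the squared perpendicular distance from the first axis is just the sum of the squares of the other two components.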

The rotational kinetic energy of a body depends on the distri- 
bution of mass of the body solely through the inertia tensor. Re- 
markably, the inertia tensor involves only second order moments 
of the mass distribution with respect to the center of mass. We 
might have expected the kinetic energy to depend in a complicated 
way on all the moments of the mass distribution, interwoven in 
some complicated way with the components of the angular ve- 
locity vector, but this is not the case. This fact has a remarkable 
consequence: for the motion of a free rigid body the detailed shape 
of the body does not matter. If a book and a banana have the 
same inertia tensor, that is, the same second order mass moments, 
then if they are thrown in the same way the subsequent motion 
will be the same, however complicated that motion is. The fact
that the book has corners and the banana has a stem does not affect
the motion except for their contributions to the inertia tensor. In
general, the potential energy of an extended body is not so simple
and does indeed depend on all moments of the mass distribution,
but for the kinetic energy the second moments are all that matter!

3 Here we avoid the more consistent notation $(\xi_\alpha^0, \xi_\alpha^1, \xi_\alpha^2)$ for the components
of $\vec{\xi}_\alpha$ because it is awkward to write expressions involving powers of the
components written this way.

Exercise 2.1: Rotational kinetic energy 

An interesting alternate form for the rotational kinetic energy can be
found by decomposing $\vec{\xi}_\alpha$ into components parallel and perpendicular
to the rotation axis $\hat{\omega}$. Show that the rotational kinetic energy can also
be written

$$T_R = \frac{1}{2} I \omega^2, \qquad (2.17)$$

where $I$ is the moment of inertia about the line through the center of
mass with direction $\hat{\omega}$, and $\omega$ is the instantaneous rate of rotation.

Exercise 2.2: Steiner’s theorem 

Let I be the moment of inertia of a body with respect to some given line 
through the center of mass. Show that the moment of inertia I' with 
respect to a second line parallel to the first is 

$$I' = I + M R^2, \qquad (2.18)$$

where M is the total mass of the body and R is the distance between 
the lines. 

Exercise 2.3: Some useful moments of inertia 

Show that the moments of inertia of the following objects are as given: 

a. The moment of inertia of a sphere of uniform density with mass M
and radius R about any line through the center is $\frac{2}{5} M R^2$.

b. The moment of inertia of a spherical shell with mass M and radius
R about any line through the center is $\frac{2}{3} M R^2$.

c. The moment of inertia of a cylinder of uniform density with mass M
and radius R about the axis of the cylinder is $\frac{1}{2} M R^2$.

d. The moment of inertia of a thin rod of uniform density per unit
length with mass M and length L about an axis perpendicular to the
rod through the center of mass is $\frac{1}{12} M L^2$.

Exercise 2.4: Jupiter 

a. The density of a planet increases toward the center. Provide an 
argument that the moment of inertia is less than that of a sphere of 
uniform density of the same mass and radius. 

b. The density as a function of radius inside Jupiter is well approximated
by

$$\rho(r) = \frac{M}{R^3} \frac{\sin(\pi r/R)}{4 r/R},$$

where M is the mass and R is the radius of Jupiter. Find the moment
of inertia of Jupiter in terms of M and R.



2.4 Inertia Tensor 

The representation of the rotational kinetic energy in terms of the 
inertia tensor was derived with the help of a rectangular coordinate
system with basis vectors $\hat{e}_i$. There was nothing special about
this particular rectangular basis. So, the kinetic energy must have 
the same form in any rectangular coordinate system. We can use 
this fact to derive how the inertia tensor changes if the body or 
the coordinate system is rotated. 

Let’s talk a bit about active and passive rotations. The rotation
of the vector $\vec{x}$ by the rotation $R$ produces a new vector $\vec{x}' = R\vec{x}$.
We may write $\vec{x}$ in terms of its components with respect to some
arbitrary rectangular coordinate system with orthonormal basis
vectors $\hat{e}_i$: $\vec{x} = x^0 \hat{e}_0 + x^1 \hat{e}_1 + x^2 \hat{e}_2$. Let $\mathbf{x}$ indicate the column
matrix of components $x^0$, $x^1$, and $x^2$ of $\vec{x}$, and $\mathbf{R}$ be the matrix
representation of $R$ with respect to the same basis. In these terms
rotation can be written $\mathbf{x}' = \mathbf{R}\mathbf{x}$. The rotation matrix $\mathbf{R}$ is a real
orthogonal matrix.⁴ A rotation that carries vectors to new vectors
is called an active rotation.
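A concrete active rotation makes these matrix statements easy to test. The following sketch assumes a rotation about the third basis vector by an arbitrary angle; the function names and the sample vector are ours. It applies the rotation matrix to a column of components and checks that the transpose of the matrix is its inverse, that the determinant is 1, and that the length of the vector is unchanged.

```python
import math

def rotation_about_z(angle):
    """Matrix of an active rotation by angle about the third basis vector."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

R = rotation_about_z(math.pi / 6)   # hypothetical rotation angle
x = [1.0, 2.0, 3.0]                 # hypothetical components of a vector

x_rotated = matvec(R, x)            # components of the rotated vector
should_be_identity = matmul(transpose(R), R)

print(x_rotated)
print(det3(R))
```

Because the rotation is about the third basis vector, the third component of the vector is unchanged while the first two mix.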

Alternately, we can rotate the coordinate system by rotating the 
basis vectors, but leave other vectors that might be represented 
in terms of them unchanged. If a vector is unchanged but the 
basis vectors are rotated then the components of the vector on 
the rotated basis vectors are not the same as the components 
on the original basis vectors. Denote the rotated basis vectors
by $\hat{e}'_i = R\hat{e}_i$. The component of a vector along a basis vector
is the dot product of the vector with the basis vector. So the
components of the vector $\vec{x}$ along the rotated basis $\hat{e}'_i$ are $(x')^i =
\vec{x} \cdot \hat{e}'_i = \vec{x} \cdot (R\hat{e}_i) = (R^{-1}\vec{x}) \cdot \hat{e}_i$.⁵ So the components with respect to
the rotated basis elements are the same as the components of the
rotated vector $R^{-1}\vec{x}$ with respect to the original basis. In terms
of components, if the vector $\vec{x}$ has components $\mathbf{x}$ with respect to
the original basis vectors $\hat{e}_i$, then the components $\mathbf{x}'$ of the same



4 The orthogonal matrix $\mathbf{R}$ of a rotation satisfies $\mathbf{R}^T = \mathbf{R}^{-1}$ and $\det \mathbf{R} = 1$.

5 The last equality follows from the fact that the rotation of two vectors
preserves the dot product: $\vec{x} \cdot \vec{y} = (R\vec{x}) \cdot (R\vec{y})$, or $(R^{-1}\vec{x}) \cdot \vec{y} = \vec{x} \cdot (R\vec{y})$.




